THERAPEUTIC AND DIAGNOSTIC METHODS AND COMPOSITIONS
BASED ON NOTCH PROTEINS AND NUCLEIC ACIDS 



This application is a continuation-in-part of copending application
Serial No. 08/083,590 filed June 25, 1993, which is a continuation-in-part of both
application Serial No. 07/955,012 filed September 30, 1992, now abandoned, and
copending application Serial No. 07/879,038 filed April 30, 1992, each of which
is incorporated by reference herein in its entirety.

This invention was made in part with government support under 
grant numbers GM 29093 and NS 26084 awarded by the National Institutes of 
Health. The government has certain rights in the invention. 

1. INTRODUCTION 
The present invention relates to therapeutic compositions
comprising Notch proteins, analogs and derivatives thereof, antibodies thereto,
nucleic acids encoding the Notch proteins, derivatives or analogs, Notch antisense
nucleic acids, and toporythmic proteins which bind to Notch and their nucleic
acids and antibodies. Therapeutic and diagnostic methods are also provided.

2. BACKGROUND OF THE INVENTION

2.1. THE NOTCH GENE AND PROTEIN
Null mutations in any one of the zygotic neurogenic loci - Notch
(N), Delta (Dl), mastermind (mam), Enhancer of split (E(spl)), neuralized (neu),
and big brain (bib) - result in hypertrophy of the nervous system at the expense of
ventral and lateral epidermal structures. This effect is due to the misrouting of
epidermal precursor cells into a neuronal pathway, and implies that neurogenic
gene function is necessary to divert cells within the neurogenic region from a
neuronal fate to an epithelial fate. Studies that assessed the effects of laser
ablation of specific embryonic neuroblasts in grasshoppers (Doe and Goodman
1985, Dev. Biol. 111, 206-219) have shown that cellular interactions between
neuroblasts and the surrounding accessory cells serve to inhibit these accessory
cells from adopting a neuroblast fate. Together, these genetic and developmental
observations have led to the hypothesis that the protein products of the neurogenic
loci function as components of a cellular interaction mechanism necessary for
proper epidermal development (Artavanis-Tsakonas, 1988, Trends Genet. 4,
95-100).

Sequence analyses (Wharton et al., 1985, Cell 43, 567-581; Kidd
et al., 1986, Mol. Cell. Biol. 6, 3094-3108; Vassin et al., 1987, EMBO J. 6,
3431-3440; Kopczynski et al., 1988, Genes Dev. 2, 1723-1735) have shown that
two of the neurogenic loci, Notch and Delta, appear to encode transmembrane
proteins that span the membrane a single time. The Drosophila Notch gene
encodes a ~300 kd protein (we use "Notch" to denote this protein) with a large
N-terminal extracellular domain that includes 36 epidermal growth factor
(EGF)-like tandem repeats followed by three other cysteine-rich repeats,
designated Notch/lin-12 repeats (Wharton et al., 1985, Cell 43, 567-581; Kidd
et al., 1986, Mol. Cell. Biol. 6, 3094-3108; Yochem et al., 1988, Nature 335,
547-550). The sequences of Xenopus (Coffman et al., 1990, Science 249:1438-1441)
and a human Notch homolog termed TAN-1 (Ellisen et al., 1991, Cell 66:649-661)
have also been reported. Delta encodes a ~100 kd protein (we use "Delta" to
denote DLZM, the protein product of the predominant zygotic and maternal
transcripts; Kopczynski et al., 1988, Genes Dev. 2, 1723-1735) that has nine
EGF-like repeats within its extracellular domain (Vassin et al., 1987, EMBO J.
6, 3431-3440; Kopczynski et al., 1988, Genes Dev. 2, 1723-1735). Although
little is known about the functional significance of these repeats, the EGF-like
motif has been found in a variety of proteins, including those involved in the
blood clotting cascade (Furie and Furie, 1988, Cell 53, 505-518). In particular,
this motif has been found in extracellular proteins such as the blood clotting
factors IX and X (Rees et al., 1988, EMBO J. 7, 2053-2061; Furie and Furie,
1988, Cell 53, 505-518), in other Drosophila genes (Knust et al., 1987, EMBO J.
6, 761-766; Rothberg et al., 1988, Cell 55, 1047-1059), and in some cell-surface
receptor proteins, such as thrombomodulin (Suzuki et al., 1987, EMBO J. 6,
1891-1897) and LDL receptor (Sudhof et al., 1985, Science 228, 815-822). A
protein binding site has been mapped to the EGF repeat domain in thrombomodulin
and urokinase (Kurosawa et al., 1988, J. Biol. Chem. 263, 5993-5996; Appella
et al., 1987, J. Biol. Chem. 262, 4437-4440).

An intriguing array of interactions between Notch and Delta
mutations has been described (Vassin et al., 1985, J. Neurogenet. 2, 291-308;
Shepard et al., 1989, Genetics 122, 429-438; Xu et al., 1990, Genes Dev. 4,
464-475). A number of genetic studies (summarized in Alton et al., 1989, Dev.
Genet. 10, 261-272) have indicated that the gene dosages of Notch and Delta in
relation to one another are crucial for normal development. A 50% reduction in
the dose of Delta in a wild-type Notch background causes a broadening of the
wing veins creating a "delta" at the base (Lindsley and Grell, 1968, Publication
Number 627, Washington, D.C. Carnegie Institute of Washington). A similar
phenotype is caused by a 50% increase in the dose of Notch in a wild-type Delta
background (a "Confluens" phenotype; Welshons, 1965, Science 150, 1122-1129).
This Delta phenotype is partially suppressed by a reduction in the Notch
dosage. Work has shown that lethal interactions between alleles that correlate
with alterations in the EGF-like repeats in Notch can be rescued by reducing the
dose of Delta (Xu et al., 1990, Genes Dev. 4, 464-475). Xu et al. (1990, Genes
Dev. 4, 464-475) found that null mutations at either Delta or mam suppress lethal
interactions between heterozygous combinations of certain Notch alleles, known
as the Abruptex (Ax) mutations. Ax alleles are associated with missense
mutations within the EGF-like repeats of the Notch extracellular domain (Kelley
et al., 1987, Cell 51, 539-548; Hartley et al., 1987, EMBO J. 6, 3407-3417).

Recent studies have shown that Notch and Delta, and Notch and
Serrate, directly interact on the molecular level (Fehon et al., 1990, Cell
61:523-534; Rebay et al., 1991, Cell 67:687-699).

Notch is expressed on axonal processes during the outgrowth of
embryonic neurons (Johansen et al., 1989, J. Cell Biol. 109:2427-2440; Kidd et
al., 1989, Genes Dev. 3:1113-1129; Fehon et al., 1991, J. Cell Biol.
113:657-669).

A study has shown that certain Ax alleles of Notch can severely
alter axon pathfinding during sensory neural outgrowth in the imaginal discs,
although it is not yet known whether aberrant Notch expression in the axon itself
or the epithelium along which it grows is responsible for this defect (Palka et al.,
1990, Development 109, 167-175).

2.2. CANCER 
A neoplasm, or tumor, is a neoplastic mass resulting from
abnormal uncontrolled cell growth, which may cause swelling on the body
surface, and which can be benign or malignant. Benign tumors generally remain
localized. Malignant tumors are collectively termed cancers. The term
"malignant" generally means that the tumor can invade and destroy neighboring
body structures and spread to distant sites to cause death (for review, see Robbins
and Angell, 1976, Basic Pathology, 2d Ed., W.B. Saunders Co., Philadelphia,
pp. 68-122).

Effective treatment and prevention of cancer remains a long-felt 
need, and a major goal of biomedical research. 

3. SUMMARY OF THE INVENTION
The present invention relates to therapeutic and diagnostic methods
and compositions based on Notch proteins and nucleic acids. The invention
provides for treatment of disorders of cell fate or differentiation by administration
of a therapeutic compound of the invention. Such therapeutic compounds (termed
herein "Therapeutics") include: Notch proteins and analogs and derivatives
(including fragments) thereof; antibodies thereto; nucleic acids encoding the
Notch proteins, analogs, or derivatives; Notch antisense nucleic acids; as well as
toporythmic proteins and derivatives which bind to or otherwise interact with
Notch proteins, and their encoding nucleic acids and antibodies. In a preferred
embodiment, a Therapeutic of the invention is administered to treat a cancerous
condition, or to prevent progression from a pre-neoplastic or non-malignant state
into a neoplastic or a malignant state. In other specific embodiments, a
Therapeutic of the invention is administered to treat a nervous system disorder or
to promote tissue regeneration and repair.

In one embodiment, Therapeutics which antagonize, or inhibit,
Notch function (hereinafter "Antagonist Therapeutics") are administered for
therapeutic effect; disorders which can be thus treated can be identified by in vitro
assays such as described in Section 5.1, infra. Such Antagonist Therapeutics
include but are not limited to Notch antisense nucleic acids, anti-Notch
neutralizing antibodies, and competitive inhibitors of Notch protein-protein
interactions (e.g., a protein comprising Notch ELR-11 and ELR-12 and
derivatives thereof), all as detailed infra.

In another embodiment, Therapeutics which promote Notch
function (hereinafter "Agonist Therapeutics") are administered for therapeutic
effect; disorders which can thus be treated can be identified by in vitro assays
such as described in Section 5.1, infra. Such Agonist Therapeutics include but
are not limited to Notch proteins and derivatives thereof comprising the
intracellular domain, and proteins that interact with Notch (e.g., a protein
comprising a Delta sequence homologous to Drosophila Delta amino acids 1-230
(see Figure 1 and SEQ ID NO:2), or comprising a Serrate sequence homologous
to Drosophila Serrate amino acids 79-282 (see Figure 5 and SEQ ID NO:4)).

Disorders of cell fate, in particular hyperproliferative (e.g.,
cancer) or hypoproliferative disorders, involving aberrant or undesirable levels of
expression or activity of Notch protein can be diagnosed by detecting such levels,
as described more fully infra.

In a preferred aspect, a Therapeutic of the invention is a protein
consisting of at least a fragment (termed herein "adhesive fragment") of the
proteins encoded by toporythmic genes which mediates binding to Notch proteins
or adhesive fragments thereof. Toporythmic genes, as used herein, shall mean
the genes Notch, Delta, and Serrate, as well as other members of the
Delta/Serrate family which may be identified by virtue of sequence homology or
genetic interaction, and in general, members of the "Notch cascade" or the
"Notch group" of genes, which are identified by molecular interactions (e.g.,
binding in vitro) or genetic interactions (as detected phenotypically, e.g., in
Drosophila).

In another aspect, the invention is directed to human Notch
proteins; in particular, that encoded by the hN homolog, and proteins comprising
the extracellular domain of the protein and subsequences thereof. Nucleic acids
encoding the foregoing, and recombinant cells are also provided.

3.1. DEFINITIONS 
As used herein, the following terms shall have the meanings
indicated:

AA = amino acid
EGF = epidermal growth factor
ELR = EGF-like (homologous) repeat
IC = intracellular
PCR = polymerase chain reaction

As used herein, underscoring the name of a gene shall indicate the
gene, in contrast to its encoded protein product which is indicated by the name of
the gene in the absence of any underscoring. For example, "Notch" (underscored)
shall mean the Notch gene, whereas "Notch" (not underscored) shall indicate the
protein product of the Notch gene.

4. DESCRIPTION OF THE FIGURES
Figure 1. Primary Nucleotide Sequence of the Delta cDNA Dl1
(SEQ ID NO:1) and Delta amino acid sequence (SEQ ID NO:2). The DNA
sequence of the 5'-3' strand of the Dl1 cDNA is shown, which contains a number
of corrections in comparison to that presented in Kopczynski et al. (1988, Genes
Dev. 2:1723-1735).

Figure 2. Notch Expression Constructs and the Deletion Mapping
of the Delta/Serrate Binding Domain. S2 cells in log phase growth were
transiently transfected with the series of expression constructs shown; the
drawings represent the predicted protein products of the various Notch deletion
mutants created. All expression constructs were derived from construct #1
pMtNMg. Transiently transfected cells were mixed with Delta expressing cells
from the stably transformed line L49-6-7 or with transiently transfected Serrate
expressing cells, induced with CuSO4, incubated under aggregation conditions and
then scored for their ability to aggregate using specific antisera and
immunofluorescence microscopy. Aggregates were defined as clusters of four or
more cells containing both Notch and Delta/Serrate expressing cells. The values
given for % Aggregation refer to the percentage of all Notch expressing cells
found in such clusters, either with Delta (Dl) (left column) or with Serrate (Ser)
(right column). The various Notch deletion constructs are represented
diagrammatically with splice lines indicating the ligation junctions. Each EGF
repeat is denoted as a stippled rectangular box and numbers of the EGF repeats
on either side of a ligation junction are noted. At the ligation junctions, partial
EGF repeats produced by the various deletions are denoted by open boxes and
closed brackets (for example see #23 ΔCla+EGF(10-12)). Constructs #3-13
represent the ClaI deletion series. As diagrammed, four of the ClaI sites, in
repeats 7, 9, 17 and 26, break the repeat in the middle, immediately after the
third cysteine (denoted by open box repeats; see Figure 3 for further
clarification), while the fifth and most 3' site breaks neatly between EGF repeats
30 and 31 (denoted by closed box repeat 31; again see Figure 3). In construct
#15 split, EGF repeat 14, which carries the split point mutation, is drawn as a
striped box. In construct #33 ΔCla+XEGF(10-13), the Xenopus Notch derived
EGF repeats are distinguished from Drosophila repeats by a different pattern of
shading. SP, signal peptide; EGF, epidermal growth factor repeat; N,
Notch/lin-12 repeat; TM, transmembrane domain; cdc10, cdc10/ankyrin repeats; PA,
putative nucleotide binding consensus sequence; opa, polyglutamine stretch
termed opa; Dl, Delta; Ser, Serrate.
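
By way of illustration only, and not by way of limitation, the scoring rule stated in the description of Figure 2 (an aggregate is a cluster of four or more cells containing both Notch and Delta/Serrate expressing cells; % Aggregation is the percentage of all Notch expressing cells found in such clusters) can be sketched in Python as follows. The `Cluster` record, the function names, and the sample counts are hypothetical and are introduced here only for the example; they do not represent data from the experiments.

```python
from dataclasses import dataclass
from typing import Iterable

# Threshold taken from the Figure 2 description: four or more cells.
MIN_AGGREGATE_SIZE = 4


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record of one scored cluster (a lone cell is a cluster of one)."""
    notch_cells: int   # Notch expressing cells in the cluster
    ligand_cells: int  # Delta or Serrate expressing cells in the cluster


def is_aggregate(cluster: Cluster) -> bool:
    """True if the cluster has >= 4 cells and contains both cell types."""
    total = cluster.notch_cells + cluster.ligand_cells
    return (
        total >= MIN_AGGREGATE_SIZE
        and cluster.notch_cells > 0
        and cluster.ligand_cells > 0
    )


def percent_aggregation(clusters: Iterable[Cluster]) -> float:
    """Percentage of all Notch expressing cells that lie in aggregates."""
    clusters = list(clusters)
    all_notch = sum(c.notch_cells for c in clusters)
    if all_notch == 0:
        return 0.0  # no Notch expressing cells were scored
    notch_in_aggregates = sum(c.notch_cells for c in clusters if is_aggregate(c))
    return 100.0 * notch_in_aggregates / all_notch


# Illustrative counts only (not experimental data).
example = [
    Cluster(3, 2),  # mixed, 5 cells: aggregate
    Cluster(1, 0),  # lone Notch cell: not an aggregate
    Cluster(4, 0),  # 4 cells but Notch only: not an aggregate
    Cluster(1, 1),  # mixed but only 2 cells: not an aggregate
    Cluster(1, 3),  # mixed, 4 cells: aggregate
]
print(percent_aggregation(example))
```

In this sketch the denominator is every Notch expressing cell scored, whether or not it lies in a cluster, which follows the wording "percentage of all Notch expressing cells found in such clusters."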

Figure 3. Detailed Structure of Notch Deletion Constructs #19-24:
Both EGF Repeats 11 and 12 are Required for Notch-Delta Aggregation. EGF
repeats 10-13 are diagrammed at the top showing the regular spacing of the six
cysteine residues (C). PCR products generated for these constructs (names and
numbers as given in Figure 2) are represented by the heavy black lines and the
exact endpoints are noted relative to the various EGF repeats. Ability to
aggregate with Delta is recorded as (+) or (-) for each construct. The PCR
fragments either break the EGF repeats in the middle, just after the third cysteine
in the same place as four out of the five ClaI sites, or exactly in between two
repeats in the same place as the most C-terminal ClaI site.

Figure 4. Comparison of Amino Acid Sequence of EGF Repeats
11 and 12 from Drosophila and Xenopus Notch. The amino acid sequence of
EGF repeats 11 and 12 of Drosophila Notch (SEQ ID NO:14) (Wharton et al.,
1985, Cell 43:567-581; Kidd et al., 1986, Mol. Cell. Biol. 6:3094-3108) is
aligned with that of the same two EGF repeats from Xenopus Notch (SEQ ID
NO:15) (Coffman et al., 1990, Science 249:1438-1441). Identical amino acids
are boxed. The six conserved cysteine residues of each EGF repeat and the Ca++
binding consensus residues (Rees et al., 1988, EMBO J. 7:2053-2061) are
marked with an asterisk (*). The leucine to proline change found in the Xenopus
PCR clone that failed to aggregate is noted underneath.

Figure 5. Nucleic Acid Sequence Homologies Between Serrate
and Delta. A portion of the Drosophila Serrate nucleotide sequence (SEQ ID
NO:3), with the encoded Serrate protein sequence (SEQ ID NO:4) written below
(Fleming et al., 1990, Genes & Dev. 4:2188-2201 at 2193-94) is shown. The
four regions showing high sequence homology with the Drosophila Delta
sequence are numbered above the line and indicated by brackets. The total region
of homology spans nucleotide numbers 627 through 1290 of the Serrate nucleotide
sequence (numbering as in Figure 4 of Fleming et al., 1990, Genes & Dev.
4:2188-2201).

Figure 6. Schematic Diagram of Human Notch Clones. A
schematic diagram of human Notch is shown. Heavy bold-face lines below the
diagram show that portion of the Notch sequence contained in each of the four
cDNA clones. The location of the primers used in PCR, and their orientation,
are indicated by arrows.

Figure 7. Human Notch Sequences Aligned with Drosophila
Notch Sequence. Numbered vertical lines correspond to Drosophila Notch
coordinates. Horizontal lines below each map show where clones lie relative to
stretches of sequence (thick horizontal lines).

Figure 8. Nucleotide Sequences of Human Notch Contained in
Plasmid cDNA Clone hN2k. Figure 8A: The DNA sequence (SEQ ID NO:5) of
a portion of the human Notch insert is shown, starting at the EcoRI site at the 3'
end, and proceeding in the 3' to 5' direction. Figure 8B: The DNA sequence
(SEQ ID NO:6) of a portion of the human Notch insert is shown, starting at the
EcoRI site at the 5' end, and proceeding in the 5' to 3' direction. Figure 8C:
The DNA sequence (SEQ ID NO:7) of a portion of the human Notch insert is
shown, starting 3' of the sequence shown in Figure 8B, and proceeding in the 5'
to 3' direction. The sequences shown are tentative, subject to confirmation by
determination of overlapping sequences.

Figure 9. Nucleotide Sequences of Human Notch Contained in
Plasmid cDNA Clone hN4k. Figure 9A: The DNA sequence (SEQ ID NO:8) of
a portion of the human Notch insert is shown, starting at the EcoRI site at the 5'
end, and proceeding in the 5' to 3' direction. Figure 9B: The DNA sequence
(SEQ ID NO:9) of a portion of the human Notch insert is shown, starting near
the 3' end, and proceeding in the 3' to 5' direction. The sequences shown are
tentative, subject to confirmation by determination of overlapping sequences.

Figure 10. DNA (SEQ ID NO:10) and Amino Acid (SEQ ID
NO:11) Sequences of Human Notch Contained in Plasmid cDNA Clone hN3k.

Figure 11. DNA (SEQ ID NO:12) and Amino Acid (SEQ ID
NO:13) Sequences of Human Notch Contained in Plasmid cDNA Clone hN5k.

Figure 12. Comparison of hN5k With Other Notch Homologs.
Figure 12A. Schematic representation of Drosophila Notch. Indicated are the
signal sequence (signal), the 36 EGF-like repeats, the three Notch/lin-12 repeats,
the transmembrane domain (TM), the six cdc10 repeats, the OPA repeat, and
the PEST (proline, glutamic acid, serine, threonine)-rich region. Figure 12B.
Alignment of the deduced amino acid sequence of hN5k with sequences of other
Notch homologs. Amino acids are numbered on the left side. The cdc10 and
PEST-rich regions are both boxed, and individual cdc10 repeats are marked.
Amino acids which are identical in three or more sequences are highlighted. The
primers used to clone hN5k are indicated below the sequences from which they
were designed. The nuclear localization sequence (NLS), casein kinase II (CKII),
and cdc2 kinase (cdc2) sites of the putative CcN motif of the vertebrate Notch
homologs are boxed. The possible bipartite nuclear targeting sequence (BNTS)
and proximal phosphorylation sites of Drosophila Notch are also boxed.

Figure 13. Aligned amino acid sequences of Notch proteins of
various species. humN: the human Notch protein encoded by the hN homolog
(contained in part in plasmid hN5k) (SEQ ID NO:19). TAN-1: the human Notch
protein encoded by the TAN-1 homolog (SEQ ID NO:20) (the sequence shown is
derived partly from our own work and partly from the TAN-1 sequence as published
by Ellisen et al., 1991, Cell 66:649-661). Xen N: Xenopus Notch protein (Coffman
et al., 1990, Science 249:1438-1441). Dros N: Drosophila Notch protein
(Wharton et al., 1985, Cell 43:567-581). Structural domains are indicated.

Figure 14. Immunocytochemical staining of breast cancer tissue
from a human patient. Malignant breast tissue in a sample obtained from a
human patient was embedded in a paraffin section, and subjected to
immunocytochemical staining with anti-human Notch monoclonal antibody P4,
directed against the TAN-1 protein. Non-malignant breast tissue exhibited much
less staining (not shown).

Figure 15. Immunocytochemical staining of colon tissue from a
human patient with colon cancer. A colon tissue sample obtained from a patient
with colon cancer was embedded in a paraffin section, and subjected to
immunocytochemical staining with anti-human Notch monoclonal antibody P1,
directed against the hN-encoded protein. Areas of increased staining are those
areas in which malignant cells are present, as determined by cell morphology.

Figure 16. Immunocytochemical staining of cervical tissue.
Human tissue samples were obtained, containing cancer of the cervix (Fig. 16A)
or normal cervical epithelium (Fig. 16B) from the same patient, embedded in a
paraffin section, and subjected to immunocytochemical staining with anti-human
Notch monoclonal antibody directed against the TAN-1 protein. Areas containing
malignant cells (as determined by morphology) exhibited increased staining
relative to non-malignant cells. Among non-malignant cells, connective tissue and
the basal layer of the epithelium (containing stem cells) stained with the
anti-Notch antibody.

Figure 17. DNA (SEQ ID NO:21) and encoded amino acid
sequence (contained in SEQ ID NO:19) of human Notch homolog hN. The entire
DNA coding sequence is presented (as well as noncoding sequence), with the
exclusion of that encoding the initiator Met. The last 8 nucleotides shown
(numbers 9716-9723) are vector, and not hN, sequences.

5. DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to therapeutic and diagnostic methods
and compositions based on Notch proteins and nucleic acids. The invention
provides for treatment of disorders of cell fate or differentiation by administration
of a therapeutic compound of the invention. Such therapeutic compounds (termed
herein "Therapeutics") include: Notch proteins and analogs and derivatives
(including fragments) thereof; antibodies thereto; nucleic acids encoding the
Notch proteins, analogs, or derivatives; Notch antisense nucleic acids; as well as
toporythmic proteins and derivatives and analogs thereof which bind to or
otherwise interact with Notch proteins, and their encoding nucleic acids and
antibodies. Also included are proteins and derivatives and analogs thereof which
are capable of inhibiting the interactions of a Notch protein with another
toporythmic protein (e.g., Delta, Serrate). In a preferred embodiment, a
Therapeutic of the invention is administered to treat a cancerous condition, or to
prevent progression from a pre-neoplastic or non-malignant state (e.g.,
metaplastic condition) into a neoplastic or a malignant state. In another specific
embodiment, a Therapeutic of the invention is administered to treat a nervous
system disorder, such as nerve injury or a degenerative disease. In yet another
specific embodiment, a Therapeutic of the invention is administered to promote
tissue regeneration and repair for treatment of various conditions.

In one embodiment, Therapeutics which antagonize, or inhibit,
Notch function (hereinafter "Antagonist Therapeutics") are administered for
therapeutic effect; disorders which can be thus treated can be identified by in vitro
assays such as described in Section 5.1, infra. Such Antagonist Therapeutics
include but are not limited to Notch antisense nucleic acids, anti-Notch
neutralizing antibodies, competitive inhibitors of Notch protein-protein
interactions (e.g., a protein comprising Notch ELR-11 and ELR-12), and
molecules which interfere with Notch intracellular function such as that mediated
by the cdc10 repeats, as detailed infra.

In another embodiment, Therapeutics which promote Notch
function (hereinafter "Agonist Therapeutics") are administered for therapeutic
effect; disorders which can thus be treated can be identified by in vitro assays
such as described in Section 5.1, infra. Such Agonist Therapeutics include but
are not limited to Notch proteins and derivatives thereof comprising the
intracellular domain, Notch nucleic acids encoding the foregoing, and proteins
comprising toporythmic protein domains that interact with Notch (e.g., a protein
comprising an extracellular domain of a Delta protein or a Delta sequence
homologous to Drosophila Delta amino acids 1-230 (see Figure 1 and SEQ ID
NO:2), or comprising a Serrate sequence homologous to Drosophila Serrate
amino acids 79-282 (see Figure 5 and SEQ ID NO:4)).

Disorders of cell fate, in particular precancerous conditions such as
metaplasia and dysplasia, and hyperproliferative (e.g., cancer) or
hypoproliferative disorders, involving aberrant or undesirable levels of expression
or activity of Notch protein can be diagnosed by detecting such levels, as
described more fully infra.

In a preferred aspect, a Therapeutic of the invention is a protein
consisting of at least a fragment (termed herein "adhesive fragment") of the
proteins encoded by toporythmic genes which mediates binding to Notch proteins
or adhesive fragments thereof. Toporythmic genes, as used herein, shall mean
the genes Notch, Delta, and Serrate, as well as other members of the
Delta/Serrate family which may be identified by virtue of sequence homology or
genetic interaction, and, more generally, members of the "Notch cascade" or the
"Notch group" of genes, which are identified by molecular interactions (e.g.,
binding in vitro) or genetic interactions (as detected phenotypically, e.g., in
Drosophila).

The invention further provides a human Notch protein encoded by
the hN homolog, and proteins comprising the extracellular domain of the Notch
protein and subsequences thereof. Nucleic acids encoding the foregoing, and
recombinant cells are also provided.

For clarity of disclosure, and not by way of limitation, the detailed 
description of the invention is divided into the following subsections: 

(i) Therapeutic Uses;
(ii) Prophylactic Uses;
(iii) Demonstration of Therapeutic or Prophylactic Utility;
(iv) Therapeutic/Prophylactic Administration and Compositions;
(v) Antisense Regulation of Notch Expression;
(vi) Diagnostic Utility;
(vii) Notch Nucleic Acids;
(viii) Recombinant Production of Protein Therapeutics;
(ix) Derivatives and Analogs of Notch and Other Toporythmic
Proteins;
(x) Assays of Notch Proteins, Derivatives and Analogs; and
(xi) Antibodies to Notch Proteins, Derivatives and Analogs.

5.1. THERAPEUTIC USES
As stated supra, the Antagonist Therapeutics of the invention are
those Therapeutics which antagonize, or inhibit, a Notch function. Such
Antagonist Therapeutics are most preferably identified by use of known
convenient in vitro assays, e.g., based on their ability to inhibit binding of Notch
to other proteins (see Sections 6-8 herein), or inhibit any known Notch function
as assayed in vitro, although genetic assays (e.g., in Drosophila) may also be
employed. In a preferred embodiment, the Antagonist Therapeutic is a protein or
derivative thereof comprising a functionally active fragment such as an adhesive
fragment of Notch. In specific embodiments, such an Antagonist Therapeutic
may be those adhesive proteins encoded by the appropriate constructs described in
Sections 6 and 7 infra, or proteins comprising the Notch extracellular region, in
particular ELR-11 and ELR-12, or an antibody thereto, or an analog/competitive
inhibitor of a Notch intracellular signal-transducing region, a nucleic acid capable
of expressing a Notch adhesive fragment, or a Notch antisense nucleic acid (see
Section 5.5 herein). It should be noted that in certain instances, a Notch adhesive
fragment (or possibly other presumed Antagonist Therapeutics) may alternatively
act as an Agonist Therapeutic, depending on the developmental history of the
tissue being exposed to the Therapeutic; preferably, suitable in vitro or in vivo
assays, as described infra, should be utilized to determine the effect of a specific
Therapeutic and whether its administration is indicated for treatment of the
affected tissue.

In another embodiment of the invention, a nucleic acid containing
a portion of a Notch gene is used, as an Antagonist Therapeutic, to promote
Notch inactivation by homologous recombination (Koller and Smithies, 1989,
Proc. Natl. Acad. Sci. USA 86:8932-8935; Zijlstra et al., 1989, Nature
342:435-438).

The Agonist Therapeutics of the invention, as described supra,
promote Notch function. Such Agonist Therapeutics include but are not limited
to proteins and derivatives comprising the portions of toporythmic proteins such
as Delta or Serrate that mediate binding to Notch, and nucleic acids encoding the
foregoing (which can be administered to express their encoded products in vivo).
In a specific embodiment, such a portion of Delta is D. melanogaster Delta amino
acids 1-230 (SEQ ID NO:1) or a portion of a human Delta most homologous
thereto. In another specific embodiment, such a portion of Serrate is D.
melanogaster Serrate amino acids 79-282 (SEQ ID NO:5), or a portion of a
human Serrate most homologous thereto. In other specific embodiments, such a
portion of Delta or Serrate is the extracellular portion of such protein.

Further descriptions and sources of Therapeutics of the invention
are found in Sections 5.4 through 5.8 herein.

The Agonist and Antagonist Therapeutics of the invention have
therapeutic utility for disorders of cell fate. The Agonist Therapeutics are
administered therapeutically (including prophylactically): (1) in diseases or
disorders involving an absence or decreased (relative to normal, or desired) levels
of Notch function, for example, in patients where Notch protein is lacking,
genetically defective, biologically inactive or underactive, or underexpressed; and
(2) in diseases or disorders wherein in vitro (or in vivo) assays (see infra) indicate
the utility of Notch agonist administration. The absence or decreased levels in
Notch function can be readily detected, e.g., by obtaining a patient tissue sample
(e.g., from biopsy tissue) and assaying it in vitro for protein levels, structure
and/or activity of the expressed Notch protein. Many methods standard in the art
can be thus employed, including but not limited to immunoassays to detect and/or
visualize Notch protein (e.g., Western blot, immunoprecipitation followed by
sodium dodecyl sulfate polyacrylamide gel electrophoresis, immunocytochemistry,
etc.; see also those assays listed in Section 5.6, infra), and/or hybridization assays
to detect Notch expression by detecting and/or visualizing Notch mRNA (e.g.,
Northern assays, dot blots, in situ hybridization, etc.).

In vitro assays which can be used to determine whether
administration of a specific Agonist Therapeutic or Antagonist Therapeutic is
indicated, include in vitro cell culture assays in which a patient tissue sample is
grown in culture, and exposed to or otherwise administered a Therapeutic, and
the effect of such Therapeutic upon the tissue sample is observed. In one
embodiment, where the patient has a malignancy, a sample of cells from such
malignancy is plated out or grown in culture, and the cells are then exposed to a
Therapeutic. A Therapeutic which inhibits survival or growth of the malignant
cells (e.g., by promoting terminal differentiation) is selected for therapeutic use in
vivo. Many assays standard in the art can be used to assess such survival and/or
growth; for example, cell proliferation can be assayed by measuring 3H-thymidine
incorporation, by direct cell count, by detecting changes in transcriptional activity
of known genes such as proto-oncogenes (e.g., fos, myc) or cell cycle markers;
cell viability can be assessed by trypan blue staining, differentiation can be
assessed visually based on changes in morphology, etc. In a specific aspect, the
malignant cell cultures are separately exposed to (1) an Agonist Therapeutic, and
(2) an Antagonist Therapeutic; the result of the assay can indicate which type of
Therapeutic has therapeutic efficacy.
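
The comparison described above, in which malignant cell cultures are separately exposed to an Agonist Therapeutic and to an Antagonist Therapeutic, can be illustrated by a minimal sketch. The function name, the use of a single proliferation readout per culture (e.g., counts per minute from thymidine incorporation), and the 20% reduction threshold are hypothetical assumptions made only for illustration; they are not taken from the specification and do not define any particular assay.

```python
def select_therapeutic(control, agonist, antagonist, min_reduction=0.2):
    """Pick the Therapeutic type that most reduces a proliferation readout.

    control, agonist, antagonist: proliferation readouts (for example,
    thymidine-incorporation counts) for the untreated culture and for the
    cultures exposed to each type of Therapeutic.
    min_reduction: hypothetical minimum fractional reduction, relative to
    the control, required before a type is considered indicated.
    Returns "agonist", "antagonist", or None if neither meets the threshold.
    """
    if control <= 0:
        raise ValueError("control readout must be positive")
    # Fractional reduction in proliferation relative to the untreated culture.
    reductions = {
        "agonist": 1.0 - agonist / control,
        "antagonist": 1.0 - antagonist / control,
    }
    best = max(reductions, key=reductions.get)
    return best if reductions[best] >= min_reduction else None
```

For example, with a control readout of 1000 and readouts of 950 and 400 for the agonist- and antagonist-treated cultures, the sketch selects the antagonist; in practice the other readouts named above (cell count, viability, differentiation) would also be weighed.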

In another embodiment, a Therapeutic is indicated for use which
exhibits the desired effect, inhibition or promotion of cell growth, upon a patient
cell sample from tissue having or suspected of having a hyper- or
hypoproliferative disorder, respectively. Such hyper- or hypoproliferative
disorders include but are not limited to those described in Sections 5.1.1 through
5.1.3 infra.

In another specific embodiment, a Therapeutic is indicated for use
in treating nerve injury or a nervous system degenerative disorder (see Section
5.1.2) which exhibits in vitro promotion of nerve regeneration/neurite extension
from nerve cells of the affected patient type.

In addition, administration of an Antagonist Therapeutic of the
invention is also indicated in diseases or disorders determined or known to
involve a Notch dominant activated phenotype ("gain of function" mutations).
Administration of an Agonist Therapeutic is indicated in diseases or disorders
determined or known to involve a Notch dominant negative phenotype ("loss of
function" mutations). We have investigated the functions of various structural
domains of the Notch protein in vivo, by ectopically expressing a series of
Drosophila Notch deletion mutants under the hsp70 heat-shock promoter, as well
as eye-specific promoters. Two classes of dominant phenotypes were observed,
one suggestive of Notch loss-of-function mutations and the other of Notch
gain-of-function mutations. Dominant "activated" phenotypes resulted from
overexpression of a protein lacking most extracellular sequences, while dominant
"negative" phenotypes resulted from overexpression of a protein lacking most
intracellular sequences. Our results indicate that Notch functions as a receptor
whose extracellular domain mediates ligand-binding, resulting in the transmission
of developmental signals by the cytoplasmic domain. The phenotypes observed
also suggested that the cdc10/ankyrin repeat region within the intracellular domain
plays an essential role in Notch mediated signal transduction events (intracellular
function).

In various specific embodiments, in vitro assays can be carried out
with representative cells of cell types involved in a patient's disorder, to
determine if a Therapeutic has a desired effect upon such cell types.

In another embodiment, cells of a patient tissue sample suspected
of being pre-neoplastic are similarly plated out or grown in vitro, and exposed to
a Therapeutic. The Therapeutic which results in a cell phenotype that is more
normal (i.e., less representative of a pre-neoplastic state, neoplastic state,
malignant state, or transformed phenotype) is selected for therapeutic use. Many
assays standard in the art can be used to assess whether a pre-neoplastic state,
neoplastic state, or a transformed or malignant phenotype, is present (see Section
5.2.1). For example, characteristics associated with a transformed phenotype (a
set of in vitro characteristics associated with a tumorigenic ability in vivo) include
a more rounded cell morphology, looser substratum attachment, loss of contact
inhibition, loss of anchorage dependence, release of proteases such as
plasminogen activator, increased sugar transport, decreased serum requirement,
expression of fetal antigens, disappearance of the 250,000 dalton surface protein,
etc. (see Luria et al., 1978, General Virology, 3d Ed., John Wiley & Sons, New
York pp. 436-446).

In other specific embodiments, the in vitro assays described supra
can be carried out using a cell line, rather than a cell sample derived from the
specific patient to be treated, in which the cell line is derived from or displays
characteristic(s) associated with the malignant, neoplastic or pre-neoplastic
disorder desired to be treated or prevented, or is derived from the neural or other
cell type upon which an effect is desired, according to the present invention.

The Antagonist Therapeutics are administered therapeutically
(including prophylactically): (1) in diseases or disorders involving increased
(relative to normal, or desired) levels of Notch function, for example, where the
Notch protein is overexpressed or overactive; and (2) in diseases or disorders
wherein in vitro (or in vivo) assays indicate the utility of Notch antagonist
administration. The increased levels of Notch function can be readily detected by
methods such as those described above, by quantifying protein and/or RNA. In
vitro assays with cells of patient tissue sample or the appropriate cell line or cell
type, to determine therapeutic utility, can be carried out as described above.

5.1.1. MALIGNANCIES
Malignant and pre-neoplastic conditions which can be tested as
described supra for efficacy of intervention with Antagonist or Agonist
Therapeutics, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to those described below in Sections
5.1.1 and 5.2.1.

Malignancies and related disorders, cells of which type can be
tested in vitro (and/or in vivo), and upon observing the appropriate assay result,
treated according to the present invention, include but are not limited to those
listed in Table 1 (for a review of such disorders, see Fishman et al., 1985,
Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia):



TABLE 1

MALIGNANCIES AND RELATED DISORDERS

Leukemia
    acute leukemia
        acute lymphocytic leukemia
        acute myelocytic leukemia
            myeloblastic
            promyelocytic
            myelomonocytic
            monocytic
            erythroleukemia
    chronic leukemia
        chronic myelocytic (granulocytic) leukemia
        chronic lymphocytic leukemia
Polycythemia vera
Lymphoma
    Hodgkin's disease
    non-Hodgkin's disease
Multiple myeloma
Waldenstrom's macroglobulinemia
Heavy chain disease
Solid tumors
    sarcomas and carcinomas
        fibrosarcoma
        myxosarcoma
        liposarcoma
        chondrosarcoma
        osteogenic sarcoma
        chordoma
        angiosarcoma
        endotheliosarcoma
        lymphangiosarcoma
        lymphangioendotheliosarcoma
        synovioma
        mesothelioma
        Ewing's tumor
        leiomyosarcoma
        rhabdomyosarcoma
        colon carcinoma
        pancreatic cancer
        breast cancer
        ovarian cancer
        prostate cancer
        squamous cell carcinoma
        basal cell carcinoma
        adenocarcinoma
        sweat gland carcinoma
        sebaceous gland carcinoma
        papillary carcinoma
        papillary adenocarcinomas
        cystadenocarcinoma
        medullary carcinoma
        bronchogenic carcinoma
        renal cell carcinoma
        hepatoma
        bile duct carcinoma
        choriocarcinoma
        seminoma
        embryonal carcinoma
        Wilms' tumor
        cervical cancer
        testicular tumor
        lung carcinoma
        small cell lung carcinoma
        bladder carcinoma
        epithelial carcinoma
        glioma
        astrocytoma
        medulloblastoma
        craniopharyngioma
        ependymoma
        pinealoma
        hemangioblastoma
        acoustic neuroma
        oligodendroglioma
        meningioma
        melanoma
        neuroblastoma
        retinoblastoma



In specific embodiments, malignancy or dysproliferative changes 
(such as metaplasias and dysplasias) are treated or prevented in epithelial tissues 
such as those in the cervix, esophagus, and lung. 

As detailed in the examples section 10.1 infra, malignancies of the 
breast, colon, and cervix exhibit increased expression of human Notch relative to 
such non-malignant tissue. Thus, in specific embodiments, malignancies of the 
breast, colon, or cervix are treated or prevented by administering an effective 
amount of an Antagonist Therapeutic of the invention. The presence of increased 
Notch expression in breast, colon, and cervical cancer suggests that many more 
cancerous conditions exhibit upregulated Notch. Thus, we envision that many 
more cancers, e.g., seminoma, melanoma, and lung cancer, can be treated or 
prevented by administration of an Antagonist Therapeutic. 

5.1.2. NERVOUS SYSTEM DISORDERS
Nervous system disorders, involving cell types which can be tested
as described supra for efficacy of intervention with Antagonist or Agonist
Therapeutics, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and
diseases or disorders which result in either a disconnection of axons, a diminution
or degeneration of neurons, or demyelination. Nervous system lesions which may
be treated in a patient (including human and non-human mammalian patients)
according to the invention include but are not limited to the following lesions of
either the central (including spinal cord, brain) or peripheral nervous systems:
(i) traumatic lesions, including lesions caused by physical
injury or associated with surgery, for example, lesions
which sever a portion of the nervous system, or
compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of
the nervous system results in neuronal injury or death,
including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) malignant lesions, in which a portion of the nervous system
is destroyed or injured by malignant tissue which is either a
nervous system associated malignancy or a malignancy
derived from non-nervous system tissue;

(iv) infectious lesions, in which a portion of the nervous system
is destroyed or injured as a result of infection, for
example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes
simplex virus or with Lyme disease, tuberculosis, syphilis;

(v) degenerative lesions, in which a portion of the nervous
system is destroyed or injured as a result of a degenerative
process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease,
Huntington's chorea, or amyotrophic lateral sclerosis;

(vi) lesions associated with nutritional diseases or disorders, in
which a portion of the nervous system is destroyed or
injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic
acid deficiency, Wernicke disease, tobacco-alcohol
amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic
cerebellar degeneration;

(vii) neurological lesions associated with systemic diseases
including but not limited to diabetes (diabetic neuropathy,
Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(viii) lesions caused by toxic substances including alcohol, lead,
or particular neurotoxins; and

(ix) demyelinated lesions in which a portion of the nervous
system is destroyed or injured by a demyelinating disease
including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse
myelopathy of various etiologies, progressive multifocal
leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for 
treatment of a nervous system disorder may be selected by testing for biological 
activity in promoting the survival or differentiation of neurons (see also Section 
5.1). For example, and not by way of limitation, Therapeutics which elicit any 
of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in
culture or in vivo, e.g., choline acetyltransferase or
acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased
sprouting of neurons may be detected by methods set forth in Pestronk et al.
(1980, Exp. Neurol. 70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci.
4:17-42); increased production of neuron-associated molecules may be measured
by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be
measured by assessing the physical manifestation of motor neuron disorder, e.g.,
weakness, motor neuron conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be
treated according to the invention include but are not limited to disorders such as
infarction, infection, exposure to toxin, trauma, surgical damage, degenerative
disease or malignancy that may affect motor neurons as well as other components
of the nervous system, as well as disorders that selectively affect neurons such as
amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and
juvenile muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe
syndrome), poliomyelitis and the post polio syndrome, and Hereditary
Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).

5.1.3. TISSUE REPAIR AND REGENERATION
In another embodiment of the invention, a Therapeutic of the
invention is used for promotion of tissue regeneration and repair, including but
not limited to treatment of benign dysproliferative disorders. Specific
embodiments are directed to treatment of cirrhosis of the liver (a condition in
which scarring has overtaken normal liver regeneration processes), treatment of
keloid (hypertrophic scar) formation (disfiguring of the skin in which the scarring
process interferes with normal renewal), psoriasis (a common skin condition
characterized by excessive proliferation of the skin and delay in proper cell fate
determination), and baldness (a condition in which terminally differentiated hair
follicles (a tissue rich in Notch) fail to function properly).

5.2. PROPHYLACTIC USES 
5.2.1. MALIGNANCIES 
The Therapeutics of the invention can be administered to prevent
progression to a neoplastic or malignant state, including but not limited to those
disorders listed in Table 1. Such administration is indicated where the
Therapeutic is shown in assays, as described supra, to have utility for treatment
or prevention of such disorder. Such prophylactic use is indicated in conditions
known or suspected of preceding progression to neoplasia or cancer, in particular,
where non-neoplastic cell growth consisting of hyperplasia, metaplasia, or most
particularly, dysplasia has occurred (for review of such abnormal growth
conditions, see Robbins and Angell, 1976, Basic Pathology, 2d Ed., W.B.
Saunders Co., Philadelphia, pp. 68-79). Hyperplasia is a form of controlled cell
proliferation involving an increase in cell number in a tissue or organ, without
significant alteration in structure or function. As but one example, endometrial
hyperplasia often precedes endometrial cancer. Metaplasia is a form of controlled
cell growth in which one type of adult or fully differentiated cell substitutes for
another type of adult cell. Metaplasia can occur in epithelial or connective tissue
cells. Atypical metaplasia involves a somewhat disorderly metaplastic epithelium.
Dysplasia is frequently a forerunner of cancer, and is found mainly in the
epithelia; it is the most disorderly form of non-neoplastic cell growth, involving a
loss in individual cell uniformity and in the architectural orientation of cells.
Dysplastic cells often have abnormally large, deeply stained nuclei, and exhibit
pleomorphism. Dysplasia characteristically occurs where there exists chronic
irritation or inflammation, and is often found in the cervix, respiratory passages,
oral cavity, and gall bladder.

Alternatively or in addition to the presence of abnormal cell
growth characterized as hyperplasia, metaplasia, or dysplasia, the presence of one
or more characteristics of a transformed phenotype, or of a malignant phenotype,
displayed in vivo or displayed in vitro by a cell sample from a patient, can
indicate the desirability of prophylactic/therapeutic administration of a Therapeutic
of the invention. As mentioned supra, such characteristics of a transformed
phenotype include morphology changes, looser substratum attachment, loss of
contact inhibition, loss of anchorage dependence, protease release, increased
sugar transport, decreased serum requirement, expression of fetal antigens,
disappearance of the 250,000 dalton cell surface protein, etc. (see also id., at pp.
84-90 for characteristics associated with a transformed or malignant phenotype).

In a specific embodiment, leukoplakia, a benign-appearing
hyperplastic or dysplastic lesion of the epithelium, or Bowen's disease, a
carcinoma in situ, are pre-neoplastic lesions indicative of the desirability of
prophylactic intervention.

In another embodiment, fibrocystic disease (cystic hyperplasia, 
mammary dysplasia, particularly adenosis (benign epithelial hyperplasia)) is 
indicative of the desirability of prophylactic intervention. 

In other embodiments, a patient which exhibits one or more of the
following predisposing factors for malignancy is treated by administration of an
effective amount of a Therapeutic: a chromosomal translocation associated with a
malignancy (e.g., the Philadelphia chromosome for chronic myelogenous
leukemia, t(14;18) for follicular lymphoma, etc.), familial polyposis or Gardner's
syndrome (possible forerunners of colon cancer), benign monoclonal gammopathy
(a possible forerunner of multiple myeloma), and a first degree kinship with
persons having a cancer or precancerous disease showing a Mendelian (genetic)
inheritance pattern (e.g., familial polyposis of the colon, Gardner's syndrome,
hereditary exostosis, polyendocrine adenomatosis, medullary thyroid carcinoma
with amyloid production and pheochromocytoma, Peutz-Jeghers syndrome,
neurofibromatosis of Von Recklinghausen, retinoblastoma, carotid body tumor,
cutaneous melanocarcinoma, intraocular melanocarcinoma, xeroderma
pigmentosum, ataxia telangiectasia, Chediak-Higashi syndrome, albinism,
Fanconi's aplastic anemia, and Bloom's syndrome; see Robbins and Angell, 1976,
Basic Pathology, 2d Ed., W.B. Saunders Co., Philadelphia, pp. 112-113), etc.

In another specific embodiment, an Antagonist Therapeutic of the
invention is administered to a human patient to prevent progression to breast,
colon, or cervical cancer.

5.2.2. OTHER DISORDERS 
In other embodiments, a Therapeutic of the invention can be
administered to prevent a nervous system disorder described in Section 5.1.2, or
other disorder (e.g., liver cirrhosis, psoriasis, keloids, baldness) described in
Section 5.1.3.

5.3. DEMONSTRATION OF THERAPEUTIC
OR PROPHYLACTIC UTILITY

The Therapeutics of the invention can be tested in vivo for the
desired therapeutic or prophylactic activity. For example, such compounds can
be tested in suitable animal model systems prior to testing in humans, including
but not limited to rats, mice, chickens, cows, monkeys, rabbits, etc. For in vivo
testing, prior to administration to humans, any animal model system known in the
art may be used.

5.4. THERAPEUTIC/PROPHYLACTIC
ADMINISTRATION AND COMPOSITIONS

The invention provides methods of treatment (and prophylaxis) by
administration to a subject of an effective amount of a Therapeutic of the
invention. In a preferred aspect, the Therapeutic is substantially purified. The
subject is preferably an animal, including but not limited to animals such as cows,
pigs, chickens, etc., and is preferably a mammal, and most preferably human.

Various delivery systems are known and can be used to administer
a Therapeutic of the invention, e.g., encapsulation in liposomes, microparticles,
microcapsules, expression by recombinant cells, receptor-mediated endocytosis
(see, e.g., Wu and Wu, 1987, J. Biol. Chem. 262:4429-4432), construction of a
Therapeutic nucleic acid as part of a retroviral or other vector, etc. Methods of
introduction include but are not limited to intradermal, intramuscular,
intraperitoneal, intravenous, subcutaneous, intranasal, and oral routes. The
compounds may be administered by any convenient route, for example by
infusion or bolus injection, by absorption through epithelial or mucocutaneous
linings (e.g., oral mucosa, rectal and intestinal mucosa, etc.) and may be
administered together with other biologically active agents. Administration can be
systemic or local. In addition, it may be desirable to introduce the
pharmaceutical compositions of the invention into the central nervous system by
any suitable route, including intraventricular and intrathecal injection;
intraventricular injection may be facilitated by an intraventricular catheter, for
example, attached to a reservoir, such as an Ommaya reservoir.

In a specific embodiment, it may be desirable to administer the
pharmaceutical compositions of the invention locally to the area in need of
treatment; this may be achieved by, for example, and not by way of limitation,
local infusion during surgery, topical application, e.g., in conjunction with a
wound dressing after surgery, by injection, by means of a catheter, by means of a
suppository, or by means of an implant, said implant being of a porous,
non-porous, or gelatinous material, including membranes, such as silastic
membranes, or fibers. In one embodiment, administration can be by direct
injection at the site (or former site) of a malignant tumor or neoplastic or
pre-neoplastic tissue.

In a specific embodiment, administration of a Therapeutic into a
Notch-expressing cell is accomplished by linkage of the Therapeutic to a Delta (or
other toporythmic) protein or portion thereof capable of mediating binding to
Notch. Contact of a Notch-expressing cell with the linked Therapeutic results in
binding of the linked Therapeutic via its Delta portion to Notch on the surface of
the cell, followed by uptake of the linked Therapeutic into the Notch-expressing
cell.

In a specific embodiment wherein an analog of a Notch
intracellular signal-transducing domain is employed as a Therapeutic, such that it
can inhibit Notch signal transduction, the analog is preferably delivered
intracellularly (e.g., by expression from a nucleic acid vector, or by linkage to a
Delta protein capable of binding to Notch followed by binding and internalization,
or by receptor-mediated mechanisms).

In a specific embodiment where the Therapeutic is a nucleic acid
encoding a protein Therapeutic, the nucleic acid can be administered in vivo to
promote expression of its encoded protein, by constructing it as part of an
appropriate nucleic acid expression vector and administering it so that it becomes
intracellular, e.g., by use of a retroviral vector (see U.S. Patent No. 4,980,286),
or by direct injection, or by use of microparticle bombardment (e.g., a gene gun;
Biolistic, Dupont), or coating with lipids or cell-surface receptors or transfecting
agents, or by administering it in linkage to a homeobox-like peptide which is
known to enter the nucleus (see e.g., Joliot et al., 1991, Proc. Natl. Acad. Sci.
USA 88:1864-1868), etc. Alternatively, a nucleic acid Therapeutic can be
introduced intracellularly and incorporated within host cell DNA for expression,
by homologous recombination.

In specific embodiments directed to treatment or prevention of 
particular disorders, preferably the following forms of administration are used: 



Disorder                   Preferred Forms of Administration
Cervical cancer            Topical
Gastrointestinal cancer    Oral; intravenous
Lung cancer                Inhaled; intravenous
Leukemia                   Intravenous; extracorporeal
Metastatic carcinomas      Intravenous; oral
Brain cancer               Targeted; intravenous; intrathecal
Liver cirrhosis            Oral; intravenous
Psoriasis                  Topical
Keloids                    Topical
Baldness                   Topical
Spinal cord injury         Targeted; intravenous; intrathecal
Parkinson's disease        Targeted; intravenous; intrathecal
Motor neuron disease       Targeted; intravenous; intrathecal
Alzheimer's disease        Targeted; intravenous; intrathecal

The present invention also provides pharmaceutical compositions.
Such compositions comprise a therapeutically effective amount of a Therapeutic,
and a pharmaceutically acceptable carrier or excipient. Such a carrier includes
but is not limited to saline, buffered saline, dextrose, water, glycerol, ethanol,
and combinations thereof. The carrier and composition can be sterile. The
formulation should suit the mode of administration.

The composition, if desired, can also contain minor amounts of
wetting or emulsifying agents, or pH buffering agents. The composition can be a
liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release
formulation, or powder. The composition can be formulated as a suppository,
with traditional binders and carriers such as triglycerides. Oral formulation can
include standard carriers such as pharmaceutical grades of mannitol, lactose,
starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate,
etc.

In a preferred embodiment, the composition is formulated in
accordance with routine procedures as a pharmaceutical composition adapted for
intravenous administration to human beings. Typically, compositions for
intravenous administration are solutions in sterile isotonic aqueous buffer. Where
necessary, the composition may also include a solubilizing agent and a local
anesthetic such as lignocaine to ease pain at the site of the injection. Generally,
the ingredients are supplied either separately or mixed together in unit dosage
form, for example, as a dry lyophilized powder or water free concentrate in a
hermetically sealed container such as an ampoule or sachette indicating the
quantity of active agent. Where the composition is to be administered by
infusion, it can be dispensed with an infusion bottle containing sterile
pharmaceutical grade water or saline. Where the composition is administered by
injection, an ampoule of sterile water for injection or saline can be provided so
that the ingredients may be mixed prior to administration.



The Therapeutics of the invention can be formulated as neutral or
salt forms. Pharmaceutically acceptable salts include those formed with free
amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic,
tartaric acids, etc., and those formed with free carboxyl groups such as those
derived from sodium, potassium, ammonium, calcium, ferric hydroxides,
isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc.

The amount of the Therapeutic of the invention which will be
effective in the treatment of a particular disorder or condition will depend on the
nature of the disorder or condition, and can be determined by standard clinical
techniques. In addition, in vitro assays may optionally be employed to help
identify optimal dosage ranges. The precise dose to be employed in the
formulation will also depend on the route of administration, and the seriousness of
the disease or disorder, and should be decided according to the judgment of the
practitioner and each patient's circumstances. However, suitable dosage ranges
for intravenous administration are generally about 20-500 micrograms of active
compound per kilogram body weight. Suitable dosage ranges for intranasal
administration are generally about 0.01 pg/kg body weight to 1 mg/kg body
weight. Effective doses may be extrapolated from dose-response curves derived
from in vitro or animal model test systems.

Suppositories generally contain active ingredient in the range of
0.5% to 10% by weight; oral formulations preferably contain 10% to 95% active
ingredient.

The invention also provides a pharmaceutical pack or kit
comprising one or more containers filled with one or more of the ingredients of
the pharmaceutical compositions of the invention. Optionally associated with such
container(s) can be a notice in the form prescribed by a governmental agency
regulating the manufacture, use or sale of pharmaceuticals or biological products,
which notice reflects approval by the agency of manufacture, use or sale for
human administration.
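
The intravenous dosage range stated in this section (about 20-500 micrograms of active compound per kilogram body weight) scales linearly with body weight. The arithmetic can be sketched as follows; the function name and the example body weight are hypothetical and chosen only for illustration, and the precise dose remains a matter for the judgment of the practitioner as stated in this section.

```python
def iv_dose_range_micrograms(body_weight_kg,
                             low_ug_per_kg=20.0,
                             high_ug_per_kg=500.0):
    """Return (low, high) total intravenous dose in micrograms.

    The default bounds are the approximate 20-500 microgram/kg range
    stated in the text; the function itself is an illustrative helper,
    not part of the specification.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    # Total dose is the per-kilogram bound multiplied by body weight.
    return (low_ug_per_kg * body_weight_kg,
            high_ug_per_kg * body_weight_kg)


# Example: a hypothetical 70 kg subject.
low, high = iv_dose_range_micrograms(70.0)
```

For the hypothetical 70 kg subject, the range works out to 1,400 to 35,000 micrograms, i.e., 1.4 to 35 milligrams.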
5.5. ANTISENSE REGULATION OF NOTCH EXPRESSION

The present invention provides the therapeutic or prophylactic use
of nucleic acids of at least six nucleotides that are antisense to a gene or cDNA
encoding Notch or a portion thereof. "Antisense" as used herein refers to a
nucleic acid capable of hybridizing to a portion of a Notch RNA (preferably
mRNA) by virtue of some sequence complementarity. Such antisense nucleic
acids have utility as Antagonist Therapeutics of the invention, and can be used in
the treatment or prevention of disorders as described supra in Section 5.1 and its
subsections.
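
The notion of sequence complementarity used above can be illustrated with a minimal sketch that derives the fully complementary RNA strand for a short target segment. Full complementarity is only one case of the "some sequence complementarity" contemplated here. The function name and the nine-nucleotide example are hypothetical; the example sequence is arbitrary and is not a Notch sequence.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_mrna):
    """Return the RNA strand fully complementary to target_mrna (5'->3').

    Illustrative helper only. Enforces the minimum length of six
    nucleotides stated in the text.
    """
    seq = target_mrna.upper()
    if len(seq) < 6:
        raise ValueError("antisense nucleic acids are of at least six nucleotides")
    if any(base not in RNA_COMPLEMENT for base in seq):
        raise ValueError("target must contain only A, U, G, C")
    # Reverse the target, then complement each base, so that the result
    # reads 5'->3' and pairs antiparallel with the target.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


# Arbitrary example segment, not taken from any Notch sequence.
example = antisense_rna("AUGGCCUAA")  # "UUAGGCCAU"
```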

The antisense nucleic acids of the invention can be oligonucleotides
that are double-stranded or single-stranded, RNA or DNA or a modification or
derivative thereof, which can be directly administered to a cell, or which can be
produced intracellularly by transcription of exogenous, introduced sequences.

In a specific embodiment, the Notch antisense nucleic acids
provided by the instant invention can be used for the treatment of tumors or other
disorders, the cells of which tumor type or disorder can be demonstrated (in vitro 
or in vivo) to express the Notch gene. Such demonstration can be by detection of 
Notch RNA or of Notch protein. 

The invention further provides pharmaceutical compositions
comprising an effective amount of the Notch antisense nucleic acids of the
invention in a pharmaceutically acceptable carrier, as described supra in Section
5.4. Methods for treatment and prevention of disorders (such as those described 
in Sections 5.1 and 5.2) comprising administering the pharmaceutical 
compositions of the invention are also provided. 

In another embodiment, the invention is directed to methods for
inhibiting the expression of a Notch nucleic acid sequence in a prokaryotic or
eukaryotic cell comprising providing the cell with an effective amount of a 
composition comprising an antisense Notch nucleic acid of the invention. 

In another embodiment, the identification of cells expressing
functional Notch receptors can be carried out by observing the ability of Notch to
"rescue" such cells from the cytotoxic effects of a Notch antisense nucleic acid.

In an alternative embodiment of the invention, nucleic acids
antisense to a nucleic acid encoding an ("adhesive") toporythmic protein or
fragment that binds to Notch are envisioned as Therapeutics.

Notch antisense nucleic acids and their uses are described in detail
below.

5.5.1. NOTCH ANTISENSE NUCLEIC ACIDS 
The Notch antisense nucleic acids are of at least six nucleotides 
and are preferably oligonucleotides (ranging from 6 to about 50 nucleotides).
In specific aspects, the oligonucleotide is at least 10 nucleotides, at least 15 
nucleotides, at least 100 nucleotides, or at least 200 nucleotides. The 
oligonucleotides can be DNA or RNA or chimeric mixtures or derivatives or 
modified versions thereof, single-stranded or double-stranded. The 
oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate 
backbone. The oligonucleotide may include other appending groups such as 
peptides, or agents facilitating transport across the cell membrane (see, e.g., 
Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre et
al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. 
WO 88/09810, published December 15, 1988) or blood-brain barrier (see, e.g., 
PCT Publication No. WO 89/10134, published April 25, 1988), hybridization- 
triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) 
or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5:539-549). 

In a preferred aspect of the invention, a Notch antisense 
oligonucleotide is provided, preferably of single-stranded DNA. In a most 
preferred aspect, such an oligonucleotide comprises a sequence antisense to the 
sequence encoding ELR 11 and ELR 12 of Notch, most preferably, of human 
Notch. The oligonucleotide may be modified at any position on its structure with 
substituents generally known in the art. 

The Notch antisense oligonucleotide may comprise at least one 
modified base moiety which is selected from the group including but not limited 
to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v),
wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil,
2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-
carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.

In another embodiment, the oligonucleotide comprises at least one
modified sugar moiety selected from the group including but not limited to
arabinose, 2-fluoroarabinose, xylulose, and hexose.

In yet another embodiment, the oligonucleotide comprises at least 
one modified phosphate backbone selected from the group consisting of a 
phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a 
phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl
phosphotriester, and a formacetal or analog thereof.

In yet another embodiment, the oligonucleotide is an α-anomeric
oligonucleotide. An α-anomeric oligonucleotide forms specific double-stranded
hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gautier et al., 1987, Nucl. Acids Res.
15:6625-6641).

The oligonucleotide may be conjugated to another molecule, e.g., 
a peptide, hybridization triggered cross-linking agent, transport agent, 
hybridization-triggered cleavage agent, etc. 

Oligonucleotides of the invention may be synthesized by standard
methods known in the art, e.g. by use of an automated DNA synthesizer (such as
are commercially available from Biosearch, Applied Biosystems, etc.). As
examples, phosphorothioate oligos may be synthesized by the method of Stein et
al. (1988, Nucl. Acids Res. 16:3209), methylphosphonate oligos can be prepared
by use of controlled pore glass polymer supports (Sarin et al., 1988, Proc. Natl.
Acad. Sci. U.S.A. 85:7448-7451), etc.

In a specific embodiment, the Notch antisense oligonucleotide
comprises catalytic RNA, or a ribozyme (see, e.g., PCT International Publication
WO 90/11364, published October 4, 1990; Sarver et al., 1990, Science 247:1222-
1225). In another embodiment, the oligonucleotide is a 2'-O-methylribonucleotide
(Inoue et al., 1987, Nucl. Acids Res. 15:6131-6148), or a chimeric RNA-DNA
analogue (Inoue et al., 1987, FEBS Lett. 215:327-330).

In an alternative embodiment, the Notch antisense nucleic acid of 
the invention is produced intracellularly by transcription from an exogenous
sequence. For example, a vector can be introduced in vivo such that it is taken
up by a cell, within which cell the vector or a portion thereof is transcribed,
producing an antisense nucleic acid (RNA) of the invention. Such a vector would
contain a sequence encoding the Notch antisense nucleic acid. Such a vector can 
remain episomal or become chromosomally integrated, as long as it can be 
transcribed to produce the desired antisense RNA. Such vectors can be 
constructed by recombinant DNA technology methods standard in the art. 
Vectors can be plasmid, viral, or others known in the art, used for replication and
expression in mammalian cells. Expression of the sequence encoding the Notch
antisense RNA can be by any promoter known in the art to act in mammalian,
preferably human, cells. Such promoters can be inducible or constitutive. Such
promoters include but are not limited to: the SV40 early promoter region
(Bernoist and Chambon, 1981, Nature 290:304-310), the promoter contained in
the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell
22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. 
Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the 
metallothionein gene (Brinster et al., 1982, Nature 296:39-42), etc.

The antisense nucleic acids of the invention comprise a sequence
complementary to at least a portion of an RNA transcript of a Notch gene,
preferably a human Notch gene. However, absolute complementarity, although
preferred, is not required. A sequence "complementary to at least a portion of an
RNA," as referred to herein, means a sequence having sufficient complementarity
to be able to hybridize with the RNA, forming a stable duplex; in the case of
double-stranded Notch antisense nucleic acids, a single strand of the duplex DNA
may thus be tested, or triplex formation may be assayed. The ability to hybridize
will depend on both the degree of complementarity and the length of the antisense
nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base
mismatches with a Notch RNA it may contain and still form a stable duplex (or
triplex, as the case may be). One skilled in the art can ascertain a tolerable
degree of mismatch by use of standard procedures to determine the melting point
of the hybridized complex.
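
The dependence of duplex stability on length and mismatch can be illustrated with a rough calculation. The sketch below counts mismatched positions between a DNA oligonucleotide and an equal-length RNA target, and estimates a melting temperature with the Wallace rule (2 °C per A/T and 4 °C per G/C). The Wallace rule is a coarse approximation for short oligonucleotides and is an assumption brought in here; the specification refers only to standard procedures for determining the melting point. All names and sequences are hypothetical.

```python
# Allowed (oligonucleotide base, target RNA base) Watson-Crick pairs.
_PAIRS = {("A", "U"), ("T", "A"), ("G", "C"), ("C", "G")}


def mismatch_count(oligo_dna, target_rna):
    """Count non-pairing positions in an antiparallel oligo/target duplex.

    Both sequences are written 5'->3' and must have equal length.
    """
    if len(oligo_dna) != len(target_rna):
        raise ValueError("sequences must have equal length")
    aligned = zip(oligo_dna.upper(), reversed(target_rna.upper()))
    return sum(1 for pair in aligned if pair not in _PAIRS)


def wallace_tm_celsius(oligo):
    """Estimate Tm as 2*(A+T) + 4*(G+C); a rough guide for short oligos only."""
    seq = oligo.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    return 2 * at + 4 * gc


# Hypothetical target and two candidate oligonucleotides.
target = "AUGGCU"
print(mismatch_count("AGCCAT", target))  # fully complementary candidate
print(mismatch_count("AGCTAT", target))  # candidate with one altered base
print(wallace_tm_celsius("AGCCAT"))
```

In practice the tolerable degree of mismatch is established empirically, as stated above; the estimate serves only to show why a longer oligonucleotide can accommodate more mismatches.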



5.5.2. THERAPEUTIC UTILITY OF NOTCH
ANTISENSE NUCLEIC ACIDS

The Notch antisense nucleic acids can be used to treat (or prevent)
malignancies of a cell type which has been shown to express Notch RNA.
Malignant, neoplastic, and preneoplastic cells which can be tested for such
expression include but are not limited to those described supra in Sections 5.1.1
and 5.2.1. In a preferred embodiment, a single-stranded DNA antisense Notch
oligonucleotide is used.

Malignant (particularly, tumor) cell types which express Notch 
RNA can be identified by various methods known in the art. Such methods 
include but are not limited to hybridization with a Notch-specific nucleic acid
(e.g. by Northern hybridization, dot blot hybridization, in situ hybridization),
observing the ability of RNA from the cell type to be translated in vitro into
Notch, etc. In a preferred aspect, primary tumor tissue from a patient can be 
assayed for Notch expression prior to treatment. 

Pharmaceutical compositions of the invention (see Section 5.1.4),
comprising an effective amount of a Notch antisense nucleic acid in a
pharmaceutically acceptable carrier, can be administered to a patient having a
malignancy which is of a type that expresses Notch RNA.

The amount of Notch antisense nucleic acid which will be effective
in the treatment of a particular disorder or condition will depend on the nature of
the disorder or condition, and can be determined by standard clinical techniques.
Where possible, it is desirable to determine the antisense cytotoxicity of the tumor
type to be treated in vitro, and then in useful animal model systems prior to
testing and use in humans.

In a specific embodiment, pharmaceutical compositions comprising
Notch antisense nucleic acids are administered via liposomes, microparticles, or
microcapsules. In various embodiments of the invention, it may be useful to use
such compositions to achieve sustained release of the Notch antisense nucleic
acids. In a specific embodiment, it may be desirable to utilize liposomes targeted
via antibodies to specific identifiable tumor antigens (Leonetti et al., 1990, Proc.
Natl. Acad. Sci. U.S.A. 87:2448-2451; Renneisen et al., 1990, J. Biol. Chem.
265:16337-16342).

5.6. DIAGNOSTIC UTILITY 
Notch proteins, analogues, derivatives, and subsequences thereof, 
Notch nucleic acids (and sequences complementary thereto), anti-Notch 
antibodies, and other toporythmic proteins and derivatives and analogs thereof
which interact with Notch proteins, and inhibitors of Notch-toporythmic protein
interactions, have uses in diagnostics. Such molecules can be used in assays,
such as immunoassays, to detect, prognose, diagnose, or monitor various
conditions, diseases, and disorders affecting Notch expression, or monitor the
treatment thereof. In particular, such an immunoassay is carried out by a method
comprising contacting a sample derived from a patient with an anti-Notch
antibody under conditions such that immunospecific binding can occur, and
detecting or measuring the amount of any immunospecific binding by the
antibody. In a specific embodiment, antibody to Notch can be used to assay a
patient tissue or serum sample for the presence of Notch where an aberrant level
of Notch is an indication of a diseased condition.

The immunoassays which can be used include but are not limited
to competitive and non-competitive assay systems using techniques such as
western blots, radioimmunoassays, ELISA (enzyme linked immunosorbent assay),
"sandwich" immunoassays, immunoprecipitation assays, precipitin reactions, gel
diffusion precipitin reactions, immunodiffusion assays, agglutination assays,
complement-fixation assays, immunoradiometric assays, fluorescent
immunoassays, protein A immunoassays, to name but a few.

Notch genes and related nucleic acid sequences and subsequences,
including complementary sequences, and other toporythmic gene sequences, can
also be used in hybridization assays. Notch nucleic acid sequences, or
subsequences thereof comprising about at least 8 nucleotides, can be used as
hybridization probes. Hybridization assays can be used to detect, prognose, 
diagnose, or monitor conditions, disorders, or disease states associated with 
aberrant changes in Notch expression and/or activity as described supra. In 
particular, such a hybridization assay is carried out by a method comprising 
contacting a sample containing nucleic acid with a nucleic acid probe capable of 
hybridizing to Notch DNA or RNA, under conditions such that hybridization can 
occur, and detecting or measuring any resulting hybridization. 

As detailed in examples section 10.1 infra, increased Notch 
expression occurs in human breast, colon, and cervical cancer. Accordingly, in 
specific embodiments, human breast, colon, or cervical cancer or premalignant 
changes in such tissues is diagnosed by detecting increased Notch expression (or 
amount) in patient samples relative to the level of Notch expression (or amount)
in an analogous non-malignant, or non-premalignant, as the case may be, sample 
(from the patient or another person, as determined experimentally or as is known 
as a standard level in such samples). 
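
The comparison described above, of a patient sample against an analogous non-malignant reference, can be sketched as a simple ratio. The measurement units, the function names, and the two-fold cutoff below are illustrative assumptions only; the specification does not state a numerical threshold.

```python
def expression_fold_change(patient_level, reference_level):
    """Ratio of Notch expression (or amount) in a patient sample to a reference.

    Both levels are assumed to be measured by the same assay in the same units.
    """
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return patient_level / reference_level


def is_increased(patient_level, reference_level, min_fold=2.0):
    """Flag increased expression relative to the reference.

    min_fold is a hypothetical cutoff chosen for illustration.
    """
    return expression_fold_change(patient_level, reference_level) >= min_fold


# Hypothetical assay readings in arbitrary units.
print(expression_fold_change(6.0, 2.0))
print(is_increased(6.0, 2.0))
print(is_increased(3.0, 2.0))
```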

In one embodiment, the Notch protein (or derivative having Notch 
antigenicity) that is detected or measured is on the cell surface. In another 
embodiment, the Notch protein (or derivative) is a cell free soluble molecule 
(e.g., as measured in a blood or serum sample) or is intracellular. Without 
intending to be bound mechanistically, Applicants believe that cell free Notch may
result from secretion or shedding from the cell surface. In yet another
embodiment, soluble, cell-surface, and intracellular amounts of Notch protein or 
derivative are detected or measured. 



5.7. NOTCH NUCLEIC ACIDS

Therapeutics of the invention which are Notch nucleic acids or 
Notch antisense nucleic acids, as well as nucleic acids encoding protein 
Therapeutics, include those described below, which can be obtained by methods 
known in the art, and in particular, as described below. 

In particular aspects, the invention provides amino acid sequences
of Notch, preferably human Notch, and fragments and derivatives thereof which
comprise an antigenic determinant (i.e., can be recognized by an antibody) or
which are functionally active, as well as nucleic acid sequences encoding the
foregoing. "Functionally active" material as used herein refers to that material
displaying one or more known functional activities associated with the full-length
(wild-type) Notch protein product, e.g., binding to Delta, binding to Serrate, 
binding to any other Notch ligand, antigenicity (binding to an anti-Notch 
antibody), etc. 

In specific embodiments, the invention provides fragments of a
Notch protein consisting of at least 40 amino acids, or of at least 75 amino acids.
In other embodiments, the proteins comprise or consist essentially of the
intracellular domain, transmembrane region, extracellular domain, cdc10 region,
Notch/lin-12 repeats, or the EGF-homologous repeats, or any combination of the
foregoing, of a Notch protein. Fragments, or proteins comprising fragments,
lacking some or all of the EGF-homologous repeats of Notch are also provided.
Nucleic acids encoding the foregoing are provided.

In other specific embodiments, the invention provides nucleotide
sequences and subsequences of Notch, preferably human Notch, consisting of at
least 25 nucleotides, at least 50 nucleotides, or at least 150 nucleotides. Nucleic
acids encoding the proteins and protein fragments described above are provided,
as well as nucleic acids complementary to and capable of hybridizing to such
nucleic acids. In one embodiment, such a complementary sequence may be
complementary to a Notch cDNA sequence of at least 25 nucleotides, or of at
least 100 nucleotides. In a preferred aspect, the invention utilizes cDNA
sequences encoding human Notch or a portion thereof. In a specific embodiment,
such sequences of the human Notch gene or cDNA are as contained in plasmids
hN3k, hN4k, or hN5k (see Section 9, infra) or in the gene corresponding thereto;
such a human Notch protein sequence can be as shown in Figures 10 (SEQ ID
NO: 11) or 11 (SEQ ID NO: 13). In other embodiments, the Notch nucleic acid
and/or its encoded protein has at least a portion of the sequence shown in one of
the following publications: Wharton et al., 1985, Cell 43:567-581 (Drosophila
Notch); Kidd et al., 1986, Mol. Cell. Biol. 6:3094-3108 (Drosophila Notch);
Coffman et al., 1990, Science 249:1438-1441 (Xenopus Notch); Ellisen et al.,
1991, Cell 66:649-661 (a human Notch). In another aspect, the sequences of
human Notch are those encoding the human Notch amino acid sequences or a
portion thereof as shown in Figure 13. In a particular aspect, the human Notch
sequences are those of the hN homolog (represented in part by plasmid hN5k) or
the TAN-1 homolog.

In one embodiment of the invention, the invention is directed to
the full-length human Notch protein encoded by the hN homolog as depicted in
Figure 13, both containing the signal sequence (i.e., the precursor protein; amino
acids 1-2169) and lacking the signal sequence (i.e., the mature protein; amino
acids -26-2169), as well as portions of the foregoing (e.g., the extracellular
domain, EGF homologous repeat region, EGF-like repeats 11 and 12,
cdc-10/ankyrin repeats, etc.) and proteins comprising the foregoing, as well as
nucleic acids encoding the foregoing.

As is readily apparent, as used herein, a "nucleic acid encoding a 
fragment or portion of a Notch protein" shall be construed as referring to a 
nucleic acid encoding only the recited fragment or portion of the Notch protein 
and not other portions of the Notch protein. 

In a preferred, but not limiting, aspect of the invention, a human
Notch DNA sequence can be cloned and sequenced by the method described in 
Section 9, infra. 

In another preferred aspect, PCR is used to amplify the desired 
sequence in the library, prior to selection. For example, oligonucleotide primers
representing part of the adhesive domains encoded by a homologue of the desired 
gene can be used as primers in PCR. 

The above methods are not meant to limit the following general
description of methods by which clones of Notch may be obtained. 

Any eukaryotic cell can potentially serve as the nucleic acid source
for the molecular cloning of the Notch gene. The DNA may be obtained by
standard procedures known in the art from cloned DNA (e.g., a DNA "library"),
by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA, or
fragments thereof, purified from the desired human cell (see, for example
Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Laboratory, 2d. Ed., Cold Spring Harbor, New York; Glover, D.M.
(ed.), 1985, DNA Cloning: A Practical Approach, MRL Press, Ltd., Oxford,
U.K. Vol. I, II.) Clones derived from genomic DNA may contain regulatory and
intron DNA regions in addition to coding regions; clones derived from cDNA
will contain only exon sequences. Whatever the source, the gene should be
molecularly cloned into a suitable vector for propagation of the gene.

In the molecular cloning of the gene from genomic DNA, DNA 
fragments are generated, some of which will encode the desired gene. The DNA 
may be cleaved at specific sites using various restriction enzymes. Alternatively,
one may use DNAse in the presence of manganese to fragment the DNA, or the
DNA can be physically sheared, as for example, by sonication. The linear DNA 
fragments can then be separated according to size by standard techniques, 
including but not limited to, agarose and polyacrylamide gel electrophoresis and 
column chromatography. 
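
As a hedged illustration of site-specific cleavage followed by separation according to size, the sketch below digests a linear sequence in silico and lists the fragment lengths in descending order. EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example; the specification names no particular enzyme, and the input sequence and function names are hypothetical. Only the top strand is modelled.

```python
def digest_linear(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the top strand
    is cleaved (1 for EcoRI, which cuts G^AATTC).
    """
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cut_positions + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def fragment_sizes(sequence, site="GAATTC", cut_offset=1):
    """Fragment lengths sorted largest first, as they would be compared on a gel."""
    fragments = digest_linear(sequence, site, cut_offset)
    return sorted((len(f) for f in fragments), reverse=True)


# Hypothetical 21-base sequence containing two EcoRI sites.
example = "AAAGAATTCCCCCGAATTCTT"
print(digest_linear(example))
print(fragment_sizes(example))
```

The same size list can be compared with a known restriction map where one is available.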

Once the DNA fragments are generated, identification of the
specific DNA fragment containing the desired gene may be accomplished in a
number of ways. For example, if an amount of a portion of a Notch (of any
species) gene or its specific RNA, or a fragment thereof, e.g., the adhesive
domain, is available and can be purified and labeled, the generated DNA
fragments may be screened by nucleic acid hybridization to the labeled probe
(Benton, W. and Davis, R., 1977, Science 196, 180; Grunstein, M. and
Hogness, D., 1975, Proc. Natl. Acad. Sci. U.S.A. 72, 3961). Those DNA
fragments with substantial homology to the probe will hybridize. It is also
possible to identify the appropriate fragment by restriction enzyme digestion(s)
and comparison of fragment sizes with those expected according to a known
restriction map if such is available. Further selection can be carried out on the
basis of the properties of the gene. Alternatively, the presence of the gene may
be detected by assays based on the physical, chemical, or immunological
properties of its expressed product. For example, cDNA clones, or DNA clones
which hybrid-select the proper mRNAs, can be selected which produce a protein
that, e.g., has similar or identical electrophoretic migration, isoelectric focusing
behavior, proteolytic digestion maps, in vitro aggregation activity
("adhesiveness") or antigenic properties as known for Notch. If an antibody to
Notch is available, the Notch protein may be identified by binding of labeled
antibody to the putatively Notch synthesizing clones, in an ELISA (enzyme-linked
immunosorbent assay)-type procedure.

The Notch gene can also be identified by mRNA selection by
nucleic acid hybridization followed by in vitro translation. In this procedure,
fragments are used to isolate complementary mRNAs by hybridization. Such
DNA fragments may represent available, purified Notch DNA of another species
(e.g., Drosophila). Immunoprecipitation analysis or functional assays (e.g.,
aggregation ability in vitro; see examples infra) of the in vitro translation products
of the isolated mRNAs identifies the mRNA and,
therefore, the complementary DNA fragments that contain the desired sequences.
In addition, specific mRNAs may be selected by adsorption of polysomes isolated
from cells to immobilized antibodies specifically directed against Notch or Delta
protein. A radiolabelled Notch cDNA can be synthesized using the selected
mRNA (from the adsorbed polysomes) as a template. The radiolabelled mRNA
or cDNA may then be used as a probe to identify the Notch DNA fragments from
among other genomic DNA fragments.

Alternatives to isolating the Notch genomic DNA include, but are 
not limited to, chemically synthesizing the gene sequence itself from a known
sequence or making cDNA to the mRNA which encodes the Notch gene. For 
example, RNA for cDNA cloning of the Notch gene can be isolated from cells 
which express Notch. Other methods are possible and within the scope of the 
invention. 

The identified and isolated gene can then be inserted into an
appropriate cloning vector. A large number of vector-host systems known in the
art may be used. Possible vectors include, but are not limited to, plasmids or
modified viruses, but the vector system must be compatible with the host cell
used. Such vectors include, but are not limited to, bacteriophages such as lambda
derivatives, or plasmids such as pBR322 or pUC plasmid derivatives. The
insertion into a cloning vector can, for example, be accomplished by ligating the
DNA fragment into a cloning vector which has complementary cohesive termini.
However, if the complementary restriction sites used to fragment the DNA are
not present in the cloning vector, the ends of the DNA molecules may be
enzymatically modified. Alternatively, any site desired may be produced by
ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers
may comprise specific chemically synthesized oligonucleotides encoding
restriction endonuclease recognition sequences. In an alternative method, the
cleaved vector and Notch or Delta gene may be modified by homopolymeric
tailing. Recombinant molecules can be introduced into host cells via
transformation, transfection, infection, electroporation, etc., so that many copies
of the gene sequence are generated.

In an alternative method, the desired gene may be identified and
isolated after insertion into a suitable cloning vector in a "shot gun" approach.
Enrichment for the desired gene, for example, by size fractionation, can be
done before insertion into the cloning vector.

In specific embodiments, transformation of host cells with
recombinant DNA molecules that incorporate the isolated Notch gene, cDNA, or
synthesized DNA sequence enables generation of multiple copies of the gene.
Thus, the gene may be obtained in large quantities by growing transformants,
isolating the recombinant DNA molecules from the transformants and, when
necessary, retrieving the inserted gene from the isolated recombinant DNA.

The Notch sequences provided by the instant invention include 
those nucleotide sequences encoding substantially the same amino acid sequences 
as found in native Notch protein, and those encoded amino acid sequences with
functionally equivalent amino acids, all as described in Section 5.6 infra for
Notch derivatives. 

Similar methods to those described supra can be used to obtain a
nucleic acid encoding Delta, Serrate, or adhesive portions thereof, or other
toporythmic gene of interest. In a specific embodiment, the Delta nucleic acid
has at least a portion of the sequence shown in Figure 1 (SEQ ID NO:1). In
another specific embodiment, the Serrate nucleic acid has at least a portion of the
sequence shown in Figure 5 (SEQ ID NO:3). The nucleic acid sequences
encoding toporythmic proteins can be isolated from porcine, bovine, feline, avian,
equine, or canine, as well as primate sources and any other species in which
homologs of known toporythmic genes [including but not limited to the following
genes (with the publication of sequences in parentheses): Delta (Vassin et al.,
1987, EMBO J. 6, 3431-3440; Kopczynski et al., 1988, Genes Dev. 2, 1723-
1735; note corrections to the Kopczynski et al. sequence found in Figure 1 hereof
(SEQ ID NO:1 and SEQ ID NO:2)) and Serrate (Fleming et al., 1990, Genes &
Dev. 4, 2188-2201)] can be identified. Such sequences can be altered by
substitutions, additions or deletions that provide for functionally equivalent
molecules, as described supra.

5.8. RECOMBINANT PRODUCTION OF PROTEIN THERAPEUTICS 
The nucleic acid coding for a protein Therapeutic of the invention
can be inserted into an appropriate expression vector, i.e., a vector which
contains the necessary elements for the transcription and translation of the inserted
protein-coding sequence. The necessary transcriptional and translational signals
can also be supplied by the native toporythmic gene and/or its flanking regions.
A variety of host-vector systems may be utilized to express the protein-coding
sequence. These include but are not limited to mammalian cell systems infected
with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected
with virus (e.g., baculovirus); microorganisms such as yeast containing yeast
vectors, or bacteria transformed with bacteriophage DNA, plasmid DNA, or
cosmid DNA. The expression elements of vectors vary in their strengths and
specificities. Depending on the host-vector system utilized, any one of a number
of suitable transcription and translation elements may be used. In a specific
embodiment, the adhesive portion of the Notch gene, e.g., that encoding EGF-
like repeats (ELR) 11 and 12, is expressed. In other specific embodiments, the
human Notch gene is expressed, or a sequence encoding a functionally active
portion of human Notch.

Any of the methods previously described for the insertion of DNA
fragments into a vector may be used to construct expression vectors containing a
chimeric gene consisting of appropriate transcriptional/translational control signals
and the protein coding sequences. These methods may include in vitro
recombinant DNA and synthetic techniques and in vivo recombinants (genetic
recombination). Expression of a nucleic acid sequence encoding a Notch protein or
peptide fragment may be regulated by a second nucleic acid sequence so that the
Notch protein or peptide is expressed in a host transformed with the recombinant
DNA molecule. For example, expression of a Notch protein may be controlled
by any promoter/enhancer element known in the art. Promoters which may be
used to control toporythmic gene expression include, but are not limited to, the
SV40 early promoter region (Bernoist and Chambon, 1981, Nature 290, 304-
310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus
(Yamamoto et al., 1980, Cell 22, 787-797), the herpes thymidine kinase
promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78, 1441-1445),
the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature
296, 39-42); prokaryotic expression vectors such as the β-lactamase promoter
(Villa-Kamaroff et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75, 3727-3731), or
the lac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80, 21-
25); see also "Useful proteins from recombinant bacteria" in Scientific American,
1980, 242, 74-94; plant expression vectors comprising the nopaline synthetase
promoter region (Herrera-Estrella et al., Nature 303, 209-213) or the cauliflower
mosaic virus 35S RNA promoter (Gardner et al., 1981, Nucl. Acids Res. 9,
2871), and the promoter of the photosynthetic enzyme ribulose biphosphate
carboxylase (Herrera-Estrella et al., 1984, Nature 310, 115-120); promoter
elements from yeast or other fungi such as the Gal 4 promoter, the ADC (alcohol
dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline
phosphatase promoter, and the following animal transcriptional control regions,
which exhibit tissue specificity and have been utilized in transgenic animals:
elastase I gene control region which is active in pancreatic acinar cells (Swift et
al., 1984, Cell 38, 639-646; Ornitz et al., 1986, Cold Spring Harbor Symp.
Quant. Biol. 50, 399-409; MacDonald, 1987, Hepatology 7, 425-515); insulin
gene control region which is active in pancreatic beta cells (Hanahan, 1985,
Nature 315, 115-122), immunoglobulin gene control region which is active in
lymphoid cells (Grosschedl et al., 1984, Cell 38, 647-658; Adames et al., 1985,
Nature 318, 533-538; Alexander et al., 1987, Mol. Cell. Biol. 7, 1436-1444),
mouse mammary tumor virus control region which is active in testicular, breast,
lymphoid and mast cells (Leder et al., 1986, Cell 45, 485-495), albumin gene
control region which is active in liver (Pinkert et al., 1987, Genes and Devel. 1,
268-276), alpha-fetoprotein gene control region which is active in liver (Krumlauf
et al., 1985, Mol. Cell. Biol. 5, 1639-1648; Hammer et al., 1987, Science 235,
53-58); alpha 1-antitrypsin gene control region which is active in the liver (Kelsey
et al., 1987, Genes and Devel. 1, 161-171), beta-globin gene control region
which is active in myeloid cells (Mogram et al., 1985, Nature 315, 338-340;
Kollias et al., 1986, Cell 46, 89-94); myelin basic protein gene control region
which is active in oligodendrocyte cells in the brain (Readhead et al., 1987, Cell
48, 703-712); myosin light chain-2 gene control region which is active in skeletal
muscle (Sani, 1985, Nature 314, 283-286), and gonadotropic releasing hormone
gene control region which is active in the hypothalamus (Mason et al., 1986,
Science 234, 1372-1378).

Expression vectors containing Notch gene inserts can be identified
by three general approaches: (a) nucleic acid hybridization, (b) presence or
absence of "marker" gene functions, and (c) expression of inserted sequences. In
the first approach, the presence of a foreign gene inserted in an expression vector
can be detected by nucleic acid hybridization using probes comprising sequences
that are homologous to an inserted toporythmic gene. In the second approach, the
recombinant vector/host system can be identified and selected based upon the
presence or absence of certain "marker" gene functions (e.g., thymidine kinase
activity, resistance to antibiotics, transformation phenotype, occlusion body
formation in baculovirus, etc.) caused by the insertion of foreign genes in the
vector. For example, if the Notch gene is inserted within the marker gene
sequence of the vector, recombinants containing the Notch insert can be identified
by the absence of the marker gene function. In the third approach, recombinant
expression vectors can be identified by assaying the foreign gene product
expressed by the recombinant. Such assays can be based, for example, on the
physical or functional properties of the Notch gene product in in vitro assay
systems, e.g., aggregation (adhesive) ability (see Sections 6-7, infra).

Once a particular recombinant DNA molecule is identified and
isolated, several methods known in the art may be used to propagate it. Once a
suitable host system and growth conditions are established, recombinant
expression vectors can be propagated and prepared in quantity. As previously
explained, the expression vectors which can be used include, but are not limited
to, the following vectors or their derivatives: human or animal viruses such as
vaccinia virus or adenovirus; insect viruses such as baculovirus; yeast vectors;
bacteriophage vectors (e.g., lambda), and plasmid and cosmid DNA vectors, to
name but a few.

In addition, a host cell strain may be chosen which modulates the
expression of the inserted sequences, or modifies and processes the gene product
in the specific fashion desired. Expression from certain promoters can be
elevated in the presence of certain inducers; thus, expression of the genetically
engineered Notch protein may be controlled. Furthermore, different host cells
have characteristic and specific mechanisms for the translational and post-
translational processing and modification (e.g., glycosylation, cleavage) of
proteins. Appropriate cell lines or host systems can be chosen to ensure the
desired modification and processing of the foreign protein expressed. For
example, expression in a bacterial system can be used to produce an
unglycosylated core protein product. Expression in yeast will produce a
glycosylated product. Expression in mammalian cells can be used to ensure
"native" glycosylation of a heterologous mammalian toporythmic protein.
Furthermore, different vector/host expression systems may effect processing
reactions such as proteolytic cleavages to different extents.

In other specific embodiments, the Notch protein, fragment, 
analog, or derivative may be expressed as a fusion, or chimeric protein product 
(comprising the protein, fragment, analog, or derivative joined via a peptide bond 
to a heterologous protein sequence (of a different protein)). Such a chimeric 
product can be made by ligating the appropriate nucleic acid sequences encoding 
the desired amino acid sequences to each other by methods known in the art, in 
the proper coding frame, and expressing the chimeric product by methods 
commonly known in the art. Alternatively, such a chimeric product may be made 
by protein synthetic techniques, e.g., by use of a peptide synthesizer. 

Both cDNA and genomic sequences can be cloned and expressed. 

In other embodiments, a Notch cDNA sequence may be 
chromosomally integrated and expressed. Homologous recombination procedures 
known in the art may be used. 

5.8.1. IDENTIFICATION AND PURIFICATION 
OF THE EXPRESSED GENE PRODUCT 

Once a recombinant which expresses the Notch gene sequence is
identified, the gene product may be analyzed. This can be achieved by assays
based on the physical or functional properties of the product, including
radioactive labelling of the product followed by analysis by gel electrophoresis.

Once the Notch protein is identified, it may be isolated and
purified by standard methods including chromatography (e.g., ion exchange,
affinity, and sizing column chromatography), centrifugation, differential
solubility, or by any other standard technique for the purification of proteins.
The functional properties may be evaluated using any suitable assay, including,
but not limited to, aggregation assays (see Sections 6-7).

5.9. DERIVATIVES AND ANALOGS OF NOTCH
AND OTHER TOPORYTHMIC PROTEINS

The invention further provides, as Therapeutics, derivatives
(including but not limited to fragments) and analogs of Notch proteins. Also
provided as Therapeutics are other toporythmic proteins and derivatives and
analogs thereof, or Notch ligands, in particular, which promote or, alternatively,
inhibit the interactions of such other toporythmic proteins with Notch.

The production and use of derivatives and analogs related to Notch
are within the scope of the present invention. In a specific embodiment, the
derivative or analog is functionally active, i.e., capable of exhibiting one or more
functional activities associated with a full-length, wild-type Notch protein. As
one example, such derivatives or analogs which have the desired antigenicity can
be used, for example, in diagnostic immunoassays as described in Section 5.3.
Molecules which retain, or alternatively inhibit, a desired Notch property, e.g.,
binding to Delta or other toporythmic proteins, binding to an intracellular ligand,
can be used therapeutically as inducers, or inhibitors, respectively, of such
property and its physiological correlates. Derivatives or analogs of Notch can be
tested for the desired activity by procedures known in the art, including but not
limited to the assays described infra. In one specific embodiment, peptide
libraries can be screened to select a peptide with the desired activity; such
screening can be carried out by assaying, e.g., for binding to Notch or a Notch
binding partner such as Delta.

In particular, Notch derivatives can be made by altering Notch
sequences by substitutions, additions or deletions that provide for functionally 
equivalent molecules. Due to the degeneracy of nucleotide coding sequences, 
other DNA sequences which encode substantially the same amino acid sequence 
as a Notch gene may be used in the practice of the present invention. These 
include but are not limited to nucleotide sequences comprising all or portions of 
Notch genes which are altered by the substitution of different codons that encode 
a functionally equivalent amino acid residue within the sequence, thus producing a 
silent change. Likewise, the Notch derivatives of the invention include, but are 
not limited to, those containing, as a primary amino acid sequence, all or part of 
the amino acid sequence of a Notch protein including altered sequences in which 
functionally equivalent amino acid residues are substituted for residues within the 
sequence resulting in a silent change. For example, one or more amino acid 
residues within the sequence can be substituted by another amino acid of a similar 
polarity which acts as a functional equivalent, resulting in a silent alteration. 
Substitutes for an amino acid within the sequence may be selected from other 
members of the class to which the amino acid belongs. For example, the 
nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, 
proline, phenylalanine, tryptophan and methionine. The polar neutral amino acids 
include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine. 
The positively charged (basic) amino acids include arginine, lysine and histidine. 
The negatively charged (acidic) amino acids include aspartic acid and glutamic 
acid. 
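
By way of illustration only, and not by way of limitation, the four residue classes listed above can be written down as a small lookup so that a proposed substitution can be checked for class membership. The names RESIDUE_CLASSES, residue_class and is_same_class_substitution are hypothetical and chosen for this sketch; the class membership itself is taken directly from the preceding paragraph, expressed in one-letter codes.

```python
# Residue classes as listed in the paragraph above, in one-letter codes.
# The identifiers here are illustrative and not part of the specification.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the class to which a one-letter residue belongs."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"not one of the 20 common amino acids: {aa!r}")


def is_same_class_substitution(original: str, substitute: str) -> bool:
    """True when both residues fall in the same class listed above."""
    return residue_class(original) == residue_class(substitute)
```

For instance, is_same_class_substitution("L", "I") holds because leucine and isoleucine are both nonpolar, whereas replacing aspartic acid with lysine crosses from the acidic to the basic class. Whether a given same-class substitution is actually silent with respect to function must still be determined by assay.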

Derivatives or analogs of Notch include but are not limited to 
those peptides which are substantially homologous to Notch or fragments thereof, 
or whose encoding nucleic acid is capable of hybridizing to a Notch nucleic acid 
sequence. 

The Notch derivatives and analogs of the invention can be
produced by various methods known in the art. The manipulations which result
in their production can occur at the gene or protein level. For example, the
cloned Notch gene sequence can be modified by any of numerous strategies
known in the art (Maniatis, T., 1989, Molecular Cloning, A Laboratory Manual,
2d ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York). The
sequence can be cleaved at appropriate sites with restriction endonuclease(s),
followed by further enzymatic modification if desired, isolated, and ligated in
vitro. In the production of the gene encoding a derivative or analog of Notch,
care should be taken to ensure that the modified gene remains within the same
translational reading frame as Notch, uninterrupted by translational stop signals,
in the gene region where the desired Notch activity is encoded.
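
As a minimal sketch of the reading-frame precaution described above, the following checks that a modified coding sequence has a length divisible by three and contains no stop codon before its final codon. It assumes the standard genetic code and a DNA sequence that starts at the first base of the first codon; the names translate and frame_is_intact are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def frame_is_intact(coding_dna: str) -> bool:
    """True if the sequence is a whole number of codons with no internal stop."""
    if len(coding_dna) % 3 != 0:
        return False  # an insertion or deletion has shifted the frame
    protein = translate(coding_dna)
    return "*" not in protein[:-1]  # a stop is tolerated only as the last codon
```

A check of this kind is only a first screen: it confirms that the frame is open, not that the encoded product retains the desired Notch activity.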

Additionally, the Notch-encoding nucleic acid sequence can be
mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or
termination sequences, or to create variations in coding regions and/or form new
restriction endonuclease sites or destroy preexisting ones, to facilitate further in
vitro modification. Any technique for mutagenesis known in the art can be used,
including but not limited to, in vitro site-directed mutagenesis (Hutchinson, C., et
al., 1978, J. Biol. Chem. 253:6551), use of TAB® linkers (Pharmacia), etc.

Manipulations of the Notch sequence may also be made at the
protein level. Included within the scope of the invention are Notch protein
fragments or other derivatives or analogs which are differentially modified during
or after translation, e.g., by glycosylation, acetylation, phosphorylation,
amidation, derivatization by known protecting/blocking groups, proteolytic
cleavage, linkage to an antibody molecule or other cellular ligand, etc. Any of
numerous chemical modifications may be carried out by known techniques,
including but not limited to specific chemical cleavage by cyanogen bromide,
trypsin, chymotrypsin, papain, V8 protease, NaBH4; acetylation, formylation,
oxidation, reduction; metabolic synthesis in the presence of tunicamycin; etc.

In addition, analogs and derivatives of Notch can be chemically
synthesized. For example, a peptide corresponding to a portion of a Notch
protein which comprises the desired domain, or which mediates the desired
aggregation activity in vitro, or binding to a receptor, can be synthesized by use
of a peptide synthesizer. Furthermore, if desired, nonclassical amino acids or
chemical amino acid analogs can be introduced as a substitution or addition into
the Notch sequence. Non-classical amino acids include but are not limited to the
D-isomers of the common amino acids, α-amino isobutyric acid, 4-aminobutyric
acid, hydroxyproline, sarcosine, citrulline, cysteic acid, t-butylglycine,
t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, designer amino acids
such as β-methyl amino acids, Cα-methyl amino acids, and Nα-methyl amino
acids.

In a specific embodiment, the Notch derivative is a chimeric, or
fusion, protein comprising a Notch protein or fragment thereof fused via a peptide
bond at its amino- and/or carboxy-terminus to a non-Notch amino acid sequence.
In one embodiment, such a chimeric protein is produced by recombinant
expression of a nucleic acid encoding the protein (comprising a Notch-coding
sequence joined in-frame to a non-Notch coding sequence). Such a chimeric
product can be made by ligating the appropriate nucleic acid sequences encoding
the desired amino acid sequences to each other by methods known in the art, in
the proper coding frame, and expressing the chimeric product by methods
commonly known in the art. Alternatively, such a chimeric product may be made
by protein synthetic techniques, e.g., by use of a peptide synthesizer. In a
specific embodiment, a chimeric nucleic acid encoding a mature Notch protein
with a heterologous signal sequence is expressed such that the chimeric protein is
expressed and processed by the cell to the mature Notch protein. As another
example, and not by way of limitation, a recombinant molecule can be
constructed according to the invention, comprising coding portions of both Notch
and another toporythmic gene, e.g., Delta. The encoded protein of such a
recombinant molecule could exhibit properties associated with both Notch and
Delta and portray a novel profile of biological activities, including agonists as
well as antagonists. The primary sequence of Notch and Delta may also be used
to predict tertiary structure of the molecules using computer simulation (Hopp and
Woods, 1981, Proc. Natl. Acad. Sci. U.S.A. 78:3824-3828); Notch/Delta
chimeric recombinant genes could be designed in light of correlations between
tertiary structure and biological function. Likewise, chimeric genes comprising
portions of Notch fused to any heterologous (non-Notch) protein-encoding
sequences may be constructed. A specific embodiment relates to a chimeric
protein comprising a fragment of Notch of at least six amino acids.

In another specific embodiment, the Notch derivative is a fragment 
of Notch comprising a region of homology with another toporythmic protein. As 
used herein, a region of a first protein shall be considered "homologous" to a
second protein when the amino acid sequence of the region is at least 30% 
identical or at least 75% either identical or involving conservative changes, when 
compared to any sequence in the second protein of an equal number of amino 
acids as the number contained in the region. 
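
The homology criterion defined above can be sketched, purely for illustration, as an ungapped sliding-window comparison: the region is compared against every stretch of the second protein having the same number of residues, and is deemed homologous if any stretch meets either threshold. The specification does not enumerate which changes count as conservative; this sketch assumes a change is conservative when both residues belong to the same class listed in Section 5.9. The function name is_homologous and its parameters are hypothetical.

```python
# Assumption: "conservative change" = both residues in the same class
# (nonpolar, polar neutral, basic, acidic) as listed in Section 5.9.
CLASSES = ("ALIVPFWM", "GSTCYNQ", "RKH", "DE")
CLASS_OF = {aa: i for i, group in enumerate(CLASSES) for aa in group}


def is_homologous(region: str, second_protein: str,
                  min_identity: float = 0.30,
                  min_similarity: float = 0.75) -> bool:
    """Apply the >=30% identical OR >=75% identical-or-conservative test."""
    region, second_protein = region.upper(), second_protein.upper()
    n = len(region)
    if n == 0 or n > len(second_protein):
        return False
    # Slide a window of equal length along the second protein (no gaps).
    for start in range(len(second_protein) - n + 1):
        window = second_protein[start:start + n]
        identical = sum(a == b for a, b in zip(region, window))
        similar = sum(
            a == b or (a in CLASS_OF and CLASS_OF[a] == CLASS_OF.get(b))
            for a, b in zip(region, window)
        )
        if identical / n >= min_identity or similar / n >= min_similarity:
            return True
    return False
```

Because the comparison is ungapped and the thresholds are permissive for short regions, a sketch of this kind is best applied to regions of meaningful length; alignment programs that allow gaps would be used in practice.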

Derivatives of Serrate, Delta, other toporythmic proteins, and the 
adhesive portions thereof, can be made by methods similar to those described 
supra. 

5.9.1. DERIVATIVES OF NOTCH CONTAINING
ONE OR MORE DOMAINS OF THE PROTEIN

In a specific embodiment, the invention provides Therapeutics that 
are Notch derivatives and analogs, in particular Notch fragments and derivatives 
of such fragments, that comprise one or more domains of the Notch protein, 
including but not limited to the extracellular domain, transmembrane domain, 
intracellular domain, membrane-associated region, one or more of the EGF-like 
repeats (ELR) of the Notch protein, the cdc10 repeats, and the Notch/lin-12
repeats. In specific embodiments, the Notch derivative may lack all or a portion 
of the ELRs, or one or more other regions of the protein. 

In a specific embodiment, relating to a Notch protein of a species
other than D. melanogaster, preferably human, the fragments comprising specific
portions of Notch are those comprising portions in the respective Notch protein
most homologous to specific fragments of the Drosophila Notch protein (e.g.,
ELR 11 and ELR 12).

5.9.2. DERIVATIVES OF NOTCH OR OTHER
TOPORYTHMIC PROTEINS THAT MEDIATE
BINDING TO TOPORYTHMIC PROTEIN
DOMAINS, AND INHIBITORS THEREOF

The invention also provides Notch fragments, and analogs or
derivatives of such fragments, which mediate binding to toporythmic proteins (and
thus are termed herein "adhesive"), and nucleic acid sequences encoding the
foregoing.

Also included as Therapeutics of the invention are toporythmic
(e.g., Delta, Serrate) protein fragments, and analogs or derivatives thereof, which
mediate heterotypic binding to Notch (and thus are termed herein "adhesive"),
and nucleic acid sequences relating to the foregoing.

Also included as Therapeutics of the invention are inhibitors (e.g.,
peptide inhibitors) of the foregoing toporythmic protein interactions with Notch.

The ability to bind to a toporythmic protein can be demonstrated
by in vitro aggregation assays with cells expressing such a toporythmic protein as
well as cells expressing Notch or a Notch derivative (see Section 6). That is, the
ability of a protein fragment to bind to a Notch protein can be demonstrated by
detecting the ability of the fragment, when expressed on the surface of a first cell,
to bind to a Notch protein expressed on the surface of a second cell. Inhibitors of
the foregoing interactions can be detected by their ability to inhibit such
aggregation in vitro.

The nucleic acid sequences encoding toporythmic proteins or
adhesive domains thereof, for use in such assays, can be isolated from human,
porcine, bovine, feline, avian, equine, canine, or insect, as well as primate
sources and any other species in which homologs of known toporythmic genes
can be identified.

In a specific embodiment, the adhesive fragment of Notch is that
comprising the portion of Notch most homologous to ELR 11 and 12, i.e., amino
acid numbers 447 through 527 (SEQ ID NO:14) of the Drosophila Notch
sequence (see Figure 4). In yet another specific embodiment, the adhesive
fragment of Delta mediating binding to Notch is that comprising the portion of
Delta most homologous to about amino acid numbers 1-230 of the Drosophila
Delta sequence (SEQ ID NO:2). In a specific embodiment relating to an adhesive
fragment of Serrate, such fragment is that comprising the portion of Serrate most
homologous to about amino acid numbers 85-283 or 79-282 of the Drosophila
Serrate sequence (see Figure 5 (SEQ ID NO:4)).

Due to the degeneracy of nucleotide coding sequences, other DNA
sequences which encode substantially the same amino acid sequence as the
adhesive sequences may be used in the practice of the present invention. These
include but are not limited to nucleotide sequences comprising all or portions of
the Notch, Delta, or Serrate genes which are altered by the substitution of
different codons that encode a functionally equivalent amino acid residue within
the sequence, thus producing a silent change. Likewise, the adhesive protein
fragments or derivatives thereof, of the invention include, but are not limited to,
those containing, as a primary amino acid sequence, all or part of the amino acid
sequence of the adhesive domains including altered sequences in which
functionally equivalent amino acid residues are substituted for residues within the
sequence resulting in a silent change.
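
To illustrate the degeneracy relied upon above, the following sketch lists the codons that encode the same residue as a given codon and tests whether two DNA sequences encode an identical amino acid sequence. It assumes the standard genetic code; the short sequences in the usage note are arbitrary examples and are not taken from any Notch, Delta, or Serrate gene, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon: str) -> list[str]:
    """All codons (including the input) that encode the same residue."""
    residue = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def encodes_same_protein(dna_a: str, dna_b: str) -> bool:
    """True when both in-frame sequences translate to the same residues."""
    def translate(dna: str) -> str:
        dna = dna.upper()
        usable = len(dna) - len(dna) % 3
        return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
    return translate(dna_a) == translate(dna_b)
```

For example, ATGGAACTG and ATGGAGTTA differ at two nucleotide positions yet both encode Met-Glu-Leu, which is the kind of silent change contemplated above.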

Adhesive fragments of toporythmic proteins and potential
derivatives, analogs or peptides related to adhesive toporythmic protein
sequences, can be tested for the desired binding activity, e.g., by the in vitro
aggregation assays described in the examples herein. Adhesive derivatives or
adhesive analogs of adhesive fragments of toporythmic proteins include but are
not limited to those peptides which are substantially homologous to the adhesive
fragments, or whose encoding nucleic acid is capable of hybridizing to the nucleic
acid sequence encoding the adhesive fragments, and which peptides and peptide
analogs have positive binding activity, e.g., as tested in vitro by an aggregation
assay such as described in the examples sections infra. Such derivatives and
analogs are envisioned as Therapeutics and are within the scope of the present
invention.

The adhesive-protein related derivatives, analogs, and peptides of
the invention can be produced by various methods known in the art. The
manipulations which result in their production can occur at the gene or protein
level (see Section 5.6).

Additionally, the adhesive-encoding nucleic acid sequence can be
mutated in vitro or in vivo; and manipulations of the adhesive sequence may also
be made at the protein level (see Section 5.6).

In addition, analogs and peptides related to adhesive fragments can 
be chemically synthesized. 

5.10. ASSAYS OF NOTCH PROTEINS,
DERIVATIVES AND ANALOGS

The in vitro activity of Notch proteins, derivatives and analogs,
and other toporythmic proteins which bind to Notch, can be assayed by various
methods.

For example, in one embodiment, where one is assaying for the
ability to bind or compete with wild-type Notch for binding to anti-Notch
antibody, various immunoassays known in the art can be used, including but not
limited to competitive and non-competitive assay systems using techniques such as
radioimmunoassays, ELISA (enzyme linked immunosorbent assay), "sandwich"
immunoassays, immunoradiometric assays, gel diffusion precipitin reactions,
immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or
radioisotope labels, for example), western blots, precipitation reactions,
agglutination assays (e.g., gel agglutination assays, hemagglutination assays),
complement fixation assays, immunofluorescence assays, protein A assays, and
immunoelectrophoresis assays, etc. In one embodiment, antibody binding is
detected by detecting a label on the primary antibody. In another embodiment,
the primary antibody is detected by detecting binding of a secondary antibody or
reagent to the primary antibody. In a further embodiment, the secondary
antibody is labelled. Many means are known in the art for detecting binding in
an immunoassay and are within the scope of the present invention.

In another embodiment, where one is assaying for the ability to
mediate binding to Notch, one can carry out an in vitro aggregation assay such as
described infra in Section 6 or 7 (see also Fehon et al., 1990, Cell 61:523-534;
Rebay et al., 1991, Cell 67:687-699).

In another embodiment, where another ligand for Notch is 
identified, ligand binding can be assayed, e.g., by binding assays well known in 
the art. In another embodiment, physiological correlates of ligand binding to cells 
expressing a Notch receptor (signal transduction) can be assayed. 

In another embodiment, in insect or other model systems, genetic
studies can be done to study the phenotypic effect of a Notch mutant that is a 
derivative or analog of wild-type Notch. 

Other methods will be known to the skilled artisan and are within 
the scope of the invention. 

5.11. ANTIBODIES TO NOTCH PROTEINS, 
DERIVATIVES AND ANALOGS 

According to one embodiment of the invention, antibodies and 
fragments containing the binding domain thereof, directed against Notch, are
Therapeutics. Accordingly, Notch proteins, fragments or analogs or derivatives 
thereof, in particular, human Notch proteins or fragments thereof, may be used as 
immunogens to generate anti-Notch protein antibodies. Such antibodies can be 
polyclonal, monoclonal, chimeric, single chain, Fab fragments, or from an Fab 
expression library. In a specific embodiment, antibodies specific to EGF-like
repeats 11 and 12 of Notch may be prepared. In other embodiments, antibodies 
reactive with the extracellular domain of Notch can be generated. One example 
of such antibodies may prevent aggregation in an in vitro assay. In another 
embodiment, antibodies specific to human Notch are produced. 

Various procedures known in the art may be used for the 
production of polyclonal antibodies to a Notch protein or peptide. In a particular 
embodiment, rabbit polyclonal antibodies to an epitope of the human Notch 
protein encoded by a sequence depicted in Figure 10 or 11, or a subsequence 
thereof, can be obtained. For the production of antibody, various host animals 
can be immunized by injection with the native Notch protein, or a synthetic 
version, or fragment thereof, including but not limited to rabbits, mice, rats, etc.
Various adjuvants may be used to increase the immunological response,
depending on the host species, and including but not limited to Freund's
(complete and incomplete), mineral gels such as aluminum hydroxide, surface
active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful
human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium
parvum.

For preparation of monoclonal antibodies directed toward a Notch
protein sequence, any technique which provides for the production of antibody
molecules by continuous cell lines in culture may be used. For example, the
hybridoma technique originally developed by Kohler and Milstein (1975, Nature
256, 495-497), as well as the trioma technique, the human B-cell hybridoma
technique (Kozbor et al., 1983, Immunology Today 4, 72), and the EBV-
hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985,
in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96)
may be used.

Antibody fragments which contain the idiotype (binding domain) of
the molecule can be generated by known techniques. For example, such fragments
include but are not limited to: the F(ab')2 fragment which can be produced by
pepsin digestion of the antibody molecule; the Fab' fragments which can be
generated by reducing the disulfide bridges of the F(ab')2 fragment, and the Fab
fragments which can be generated by treating the antibody molecule with papain
and a reducing agent.

In the production of antibodies, screening for the desired antibody
can be accomplished by techniques known in the art, e.g., ELISA (enzyme-linked
immunosorbent assay). For example, to select antibodies which recognize the
adhesive domain of a Notch protein, one may assay generated hybridomas for a
product which binds to a protein fragment containing such domain. For selection
of an antibody specific to human Notch, one can select on the basis of positive
binding to human Notch and a lack of binding to Drosophila Notch.

In addition to therapeutic utility, the foregoing antibodies have
utility in diagnostic immunoassays as described in Section 5.6 supra.

Similar procedures to those described supra can be used to make
Therapeutics which are antibodies to domains of other proteins (particularly
toporythmic proteins) that bind or otherwise interact with Notch (e.g., adhesive
fragments of Delta or Serrate).


6. DOMAINS OF NOTCH MEDIATE
BINDING WITH DELTA

Intermolecular association between the products of the Notch and
Delta genes was detected by studying the effects of their expression on
aggregation in Drosophila Schneider's 2 (S2) cells (Fehon et al., 1990, Cell 61,
523-534). Direct evidence of intermolecular interactions between Notch and
Delta is described herein, as well as an assay system that can be used in
dissecting the components of this interaction. Normally nonadhesive Drosophila 
S2 cultured cells that express Notch bind specifically in a calcium-dependent 
manner to cells that express Delta. Furthermore, while cells that express Notch 
do not bind to one another, cells that express Delta do bind to one another, 
suggesting that Notch and Delta can compete for binding to Delta at the cell 
surface. Notch and Delta form detergent-soluble complexes both in cultured cells 
and embryonic cells, suggesting that Notch and Delta interact directly at the 
molecular level in vitro and in vivo. The analyses suggest that Notch and Delta 
proteins interact at the cell surface via their extracellular domains. 

6.1. EXPERIMENTAL PROCEDURES

6.1.1. EXPRESSION CONSTRUCTS
Expression constructs are described in Fehon et al., 1990, Cell 
61:523-534. Briefly, Notch encoded by the MgHa minigene, a cDNA/genomic
chimeric construct (Ramos et al., 1989, Genetics 123, 337-348) was expressed 
following insertion into pRmHa-3 (Bunch, et al., 1988, Nucl. Acids Res. 16, 
1043-1061). In the resulting construct, designated pMtNMg, the metallothionein 
promoter in pRmHa-3 is fused to Notch sequences starting 20 nucleotides 
upstream of the translation start site.

The extracellular Notch construct (ECN1) was derived from a
Notch cosmid (Ramos et al., 1989, Genetics 123, 337-348), and has an internal
deletion of the Notch coding sequences from amino acids 1790 to 2625 inclusive
(Wharton et al., 1985, Cell 43, 567-581), and a predicted frameshift that
produces a novel 59 amino acid carboxyl terminus.

For the Delta expression construct, the Dll cDNA (Kopczynski et
al., 1988, Genes Dev. 2, 1723-1735; Figure 1; SEQ ID NO:1), which includes
the complete coding capacity for Delta, was inserted into the EcoRI site of 
pRmHa-3. This construct was called pMTDU. 



6.1.2. ANTIBODY PREPARATION 
Hybridoma cell line C17.9C6 was obtained from a mouse
immunized with a fusion protein based on a 2.1 kb SalI-HindIII fragment that
includes coding sequences for most of the intracellular domain of Notch (amino
acids 1791-2504; Wharton et al., 1985, Cell 43, 567-581). The fragment was
subcloned into pUR289 (Ruther and Muller-Hill, 1983, EMBO J. 2, 1791-1794),
and then transferred into the pATH 1 expression vector (Dieckmann and
Tzagoloff, 1985, J. Biol. Chem. 260, 1513-1520) as a BglII-HindIII fragment.
Soluble fusion protein was expressed, precipitated by 25% (NH4)2SO4,
resuspended in 6 M urea, and purified by preparative isoelectric focusing using a
Rotofor (Bio-Rad) (for details, see Fehon, 1989, Rotofor Review No. 7, Bulletin
1518, Richmond, California: Bio-Rad Laboratories).

Mouse polyclonal antisera were raised against the extracellular
domain of Notch using four BstYI fragments of 0.8 kb (amino acids 237-501;
Wharton et al., 1985, Cell 43, 567-581), 1.1 kb (amino acids 501-868), 0.99 kb
(amino acids 868-1200), and 1.4 kb (amino acids 1465-1935) length, which
spanned from the fifth EGF-like repeat across the transmembrane domain, singly
inserted in-frame into the appropriate pGEX expression vector (Smith and
Johnson, 1988, Gene 67, 31-40). Fusion proteins were purified on
glutathione-agarose beads (SIGMA). Mouse and rat antisera were precipitated
with 50% (NH4)2SO4 and resuspended in PBS (150 mM NaCl, 14 mM Na2HPO4, 6 mM
NaH2PO4) with 0.02% NaN3.
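
The PBS composition given above can be converted into weighed amounts with a short calculation. The following is only a minimal sketch: the molar masses are the standard values for the anhydrous salts (an assumption, since the text does not state which salt forms were used), and the function names are hypothetical.

```python
# Anhydrous molar masses in g/mol (assumption: anhydrous salts are weighed;
# substitute the hydrate's molar mass if a hydrated salt is used instead).
MOLAR_MASS = {"NaCl": 58.44, "Na2HPO4": 141.96, "NaH2PO4": 119.98}

# Concentrations stated in the text for PBS, in mM.
PBS_MM = {"NaCl": 150.0, "Na2HPO4": 14.0, "NaH2PO4": 6.0}


def grams_per_litre(conc_mm, molar_mass):
    """Convert a millimolar concentration to grams per litre."""
    return conc_mm / 1000.0 * molar_mass


def recipe(volume_l=1.0):
    """Return the grams of each salt needed for `volume_l` litres of PBS."""
    return {
        salt: grams_per_litre(mm, MOLAR_MASS[salt]) * volume_l
        for salt, mm in PBS_MM.items()
    }


if __name__ == "__main__":
    for salt, grams in recipe().items():
        print(f"{salt}: {grams:.2f} g per litre")
```

The sodium azide (0.02%) is left out of the sketch because the text does not say whether the percentage is weight/volume.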

Hybridoma cell line 201 was obtained from a mouse immunized
with a fusion protein that includes coding sequences from the extracellular domain
of Delta (Kopczynski et al., 1988, Genes Dev. 2, 1723-1735), including
sequences extending from the fourth through the ninth EGF-like repeats in Delta
(amino acids 350-529).

Rat polyclonal antisera were obtained following immunization with
antigen derived from the same fusion protein construct. In this case, fusion
protein was prepared by lysis of IPTG-induced cells in SDS-Laemmli buffer
(Carroll and Laughon, 1987, in DNA Cloning, Volume III, D.M. Glover, ed.
(Oxford: IRL Press), pp. 89-111), separation of proteins by SDS-PAGE, excision
of the appropriate band from the gel, and electroelution of antigen from the gel
slice for use in immunization (Harlow and Lane, 1988, Antibodies: A Laboratory
Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory)).

6.1.3. CELL CULTURE AND TRANSFECTION
The S2 cell line (Schneider, 1972, J. Embryol. Exp. Morph. 27,
353-365) was grown in M3 medium (prepared by Hazleton Co.) supplemented
with 2.5 mg/ml Bacto-Peptone (Difco), 1 mg/ml TC Yeastolate (Difco), 11%
heat-inactivated fetal calf serum (FCS) (Hyclone), and 100 U/ml penicillin-100
μg/ml streptomycin-0.25 μg/ml fungizone (Hazleton). Cells growing in log phase
at ~2 x 10^6 cells/ml were transfected with 20 μg of DNA-calcium phosphate
coprecipitate in 1 ml per 5 ml of culture as previously described (Wigler et al.,
1979, Proc. Natl. Acad. Sci. USA 78, 1373-1376), with the exception that BES
buffer (SIGMA) was used in place of HEPES buffer (Chen and Okayama, 1987,
Mol. Cell. Biol. 7, 2745-2752). After 16-18 hr, cells were transferred to conical
centrifuge tubes, pelleted in a clinical centrifuge at full speed for 30 seconds,
rinsed once with 1/4 volume of fresh complete medium, resuspended in their
original volume of complete medium, and returned to the original flask.
Transfected cells were then allowed to recover for 24 hr before induction.

6.1.4. AGGREGATION ASSAYS
Expression of the Notch and Delta metallothionein constructs was
induced by the addition of CuSO4 to 0.7 mM. Cells transfected with the ECN1
construct were treated similarly. Cells were then mixed, incubated under
aggregation conditions, and scored for their ability to aggregate using specific
antisera and immunofluorescence microscopy to visualize expressing cells.

Two types of aggregation assays were used. In the first assay, a
total of 3 ml of cells (5-10 x 10^6 cells/ml) was placed in a 25 ml Erlenmeyer flask
and rotated at 40-50 rpm on a rotary shaker for 24-48 hr at room temperature.
For these experiments, cells were mixed 1-4 hr after induction began and
induction was continued throughout the aggregation period. In the second assay,
~0.6 ml of cells were placed in a 0.6 ml Eppendorf tube (leaving a small bubble)
after an overnight induction (12-16 hr) at room temperature and rocked gently for
1-2 hr at 4°C. The antibody inhibition and Ca2+ dependence experiments were
performed using the latter assay. For Ca2+ dependence experiments, cells were
first collected and rinsed in balanced saline solution (BSS) with 11% FCS
(BSS-FCS; FCS was dialyzed against 0.9% NaCl, 5 mM Tris [pH 7.5]) or in Ca2+-free
BSS-FCS containing 10 mM EGTA (Snow et al., 1989, Cell 59, 313-323) and
then resuspended in the same medium at the original volume. For the antibody
inhibition experiments, Notch-transfected cells were collected and rinsed in M3
medium and then treated before aggregation in M3 medium for 1 hr at 4°C with a
1:250 dilution of immune or preimmune sera from each of the four mice
immunized with fusion proteins containing segments from the extracellular
domain of Notch (see Antibody Preparation above).






6.1.5. IMMUNOFLUORESCENCE
Cells were collected by centrifugation (3000 rpm for 20 seconds in
an Eppendorf microcentrifuge) and fixed in 0.6 ml Eppendorf tubes with 0.5 ml
of freshly made 2% paraformaldehyde in PBS for 10 min at room temperature.
After fixing, cells were collected by centrifugation, rinsed twice in PBS, and
stained for 1 hr in primary antibody in PBS with 0.1% saponin (SIGMA) and 1%
normal goat serum (Pocono Rabbit Farm, Canadensis, PA). Monoclonal antibody
supernatants were diluted 1:10 and mouse or rat sera were diluted 1:1000 for this
step. Cells were then rinsed once in PBS and stained for 1 hr in specific
secondary antibodies (double-labeling grade goat anti-mouse and goat anti-rat,
Jackson Immunoresearch) in PBS-saponin-normal goat serum. After this
incubation, cells were rinsed twice in PBS and mounted on slides in 90%
glycerol, 10% 1 M Tris (pH 8.0), and 0.5% n-propyl gallate. Cells were viewed
under epifluorescence on a Leitz Orthoplan 2 microscope.

Confocal micrographs were taken using the Bio-Rad MRC 500
system connected to a Zeiss Axiovert compound microscope. Images were
collected using the BHS and GHS filter sets, aligned using the ALIGN program,
and merged using MERGE. Fluorescent bleed-through from the green into the
red channel was reduced using the BLEED program (all software provided by
Bio-Rad). Photographs were obtained directly from the computer monitor using
Kodak Ektar 125 film.

6.1.6. CELL LYSATES, IMMUNOPRECIPITATIONS,
AND WESTERN BLOTS

Nondenaturing detergent lysates of tissue culture cells and wild-type
Canton-S embryos were prepared on ice in ~10 cell vol of lysis buffer (300 mM
NaCl, 50 mM Tris [pH 8.0], 0.5% NP-40, 0.5% deoxycholate, 1 mM CaCl2, 1
mM MgCl2) with 1 mM phenylmethylsulfonyl fluoride (PMSF) and diisopropyl
fluorophosphate diluted 1:2500 as protease inhibitors. Lysates were sequentially
triturated using 18G, 21G, and 25G needles attached to 1 cc tuberculin syringes
and then centrifuged at full speed in a microfuge for 10 min at 4°C to remove
insoluble material. Immunoprecipitation was performed by adding ~1 μg of
antibody (1-2 μl of polyclonal antiserum) to 250-500 μl of cell lysate and
incubating for 1 hr at 4°C with agitation. To this mixture, 15 ng of goat
anti-mouse antibodies (Jackson Immunoresearch; these antibodies recognize both
mouse and rat IgG) were added and allowed to incubate for 1 hr at 4°C with
agitation. This was followed by the addition of 100 μl of fixed Staphylococcus
aureus (Staph A) bacteria (Zysorbin, Zymed; resuspended according to
manufacturer's instructions), which had been collected, washed five times in lysis
buffer, and incubated for another hour. Staph A-antibody complexes were then
pelleted by centrifugation and washed three times in lysis buffer followed by two
15 min washes in lysis buffer. After being transferred to a new tube, precipitated
material was suspended in 50 μl of SDS-PAGE sample buffer, boiled immediately
for 10 min, run on 3%-15% gradient gels, blotted to nitrocellulose, and detected
using monoclonal antibodies and HRP-conjugated goat anti-mouse secondary
antibodies as previously described (Johansen et al., 1989, J. Cell Biol. 109, 2427-
2440). For total cellular protein samples used on Western blots, cells were
collected by centrifugation, lysed in 10 cell vol of sample buffer that contained 1
mM PMSF, and boiled immediately.



6.2. RESULTS 

6.2.1. THE EXPRESSION OF NOTCH AND
DELTA IN CULTURED CELLS

To detect interactions between Notch and Delta, we examined the
behavior of cells expressing these proteins on their surfaces using an aggregation
assay. We chose the S2 cell line (Schneider, 1972, J. Embryol. Exp. Morph. 27,
353-365) for these studies. S2 cells express an aberrant Notch message and no
detectable Notch due to a rearrangement of the 5' end of the Notch coding
sequence. These cells also express no detectable Delta.

Results of Western blot and immunofluorescent analysis clearly
showed that the Notch and Delta constructs support expression of proteins of the
expected sizes and subcellular localization.

6.2.2. CELLS THAT EXPRESS NOTCH AND DELTA AGGREGATE

A simple aggregation assay was used to detect interactions between 
Notch and Delta expressed on the surface of S2 cells. 

S2 cells in log phase growth were separately transfected with
either the Notch or Delta metallothionein promoter construct. After induction
with CuSO4, transfected cells were mixed in equal numbers and allowed to
aggregate overnight at room temperature (for details, see Experimental
Procedures, Section 6.1). Alternatively, in some experiments intended to reduce
metabolic activity, cells were mixed gently at 4°C for 1-2 hr. To determine
whether aggregates had formed, cells were processed for immunofluorescence
microscopy using antibodies specific for each gene product and differently labeled
fluorescent secondary antibodies. Expressing cells usually constituted less than
5% of the total cell population because transient rather than stable transformants
were used. The remaining cells either did not express a given protein or
expressed at levels too low for detection by immunofluorescence microscopy. As
controls, we performed aggregations with only a single type of transfected cell.

The results (Fehon et al., 1990, Cell 61:523-534) showed that
while Notch-expressing (Notch+) cells alone did not form aggregates in the assay,
Delta-expressing (Delta+) cells did. The tendency for Delta+ cells to aggregate
was apparent even in nonaggregated control samples, where cell clusters of 4-8
cells that probably arose from adherence between mitotic sister cells commonly
occurred. However, clusters were more common after incubation under
aggregation conditions (e.g., 19% of Delta+ cells in aggregates before incubation
vs. 37% of Delta+ cells in aggregates after incubation), indicating that Delta+
cells are able to form stable contacts with one another in this assay.

In remarkable contrast to control experiments with Notch+ cells
alone, aggregation of mixtures of Notch+ and Delta+ cells resulted in the
formation of clusters of up to 20 or more cells. The fraction of expressing cells
found in clusters of four or more stained cells after 24 hr of aggregation ranged
from 32%-54% in mixtures of Notch+ and Delta+ cells. This range was similar
to that seen for Delta+ cells alone (37%-40%) but very different from that for
Notch+ cells alone (only 0%-5%). Although a few clusters that consisted only of
Delta+ cells were found, Notch+ cells were never found in clusters of greater than
four to five cells unless Delta+ cells were also present. Again, all cells within
these clusters expressed either Notch or Delta, even though transfected cells
composed only a small fraction of the total cell population. At 48 hr, the degree
of aggregation appeared higher (63%-71%), suggesting that aggregation had not
yet reached a maximum after 24 hr under these conditions. Also, cells
cotransfected with Notch and Delta constructs (so that all transfected cells express
both proteins) aggregated in a similar fashion under the same experimental
conditions.

Notch involvement in the aggregation process was directly tested
by examining the effect of a mixture of polyclonal antisera directed against fusion
proteins that spanned almost the entire extracellular domain of Notch on
aggregation (see Experimental Procedures, Section 6.1). To minimize artifacts
that might arise due to a metabolic response to patching of surface antigens,
antibody treatment and the aggregation assay were performed at 4°C in these
experiments. Notch+ cells were incubated with either preimmune or immune
mouse sera for 1 hr. Delta+ cells were added, and aggregation was performed for
1-2 hr. While Notch+ cells pretreated with preimmune sera aggregated with
Delta+ cells (in one of three experiments, 23% of the Notch+ cells were in
Notch+-Delta+ cell aggregates), those treated with immune sera did not (only 2%
of Notch+ cells were in aggregates). This result suggested that the extracellular
domain of Notch was required for Notch+-Delta+ cell aggregation.

6.2.3. NOTCH-DELTA-MEDIATED AGGREGATION IS
CALCIUM DEPENDENT

The ability of expressing cells to aggregate in the presence or
absence of Ca2+ ions was tested to determine whether there is a Ca2+ ion
requirement for Notch-Delta aggregation. To minimize possible nonspecific
effects due to metabolic responses to the removal of Ca2+, these experiments were
performed at 4°C. The results clearly demonstrated a dependence of
Notch-Delta-mediated aggregation on exogenous Ca2+.

6.2.4. NOTCH AND DELTA INTERACT 
WITHIN A SINGLE CELL 

The question whether Notch and Delta are associated within the
membrane of one cell that expresses both proteins was posed by examining the
distributions of Notch and Delta in cotransfected cells. To test whether the
observed colocalization was coincidental or represented a stable interaction
between Notch and Delta, live cells were treated with an excess of polyclonal
anti-Notch antiserum. This treatment resulted in "patching" of Notch on the
surface of expressing cells into discrete patches as detected by
immunofluorescence. There was a distinct correlation between the distributions
of Notch and Delta on the surfaces of these cells after this treatment, indicating
that these proteins are associated within the membrane.

6.2.5. INTERACTIONS WITH DELTA DO NOT REQUIRE
THE INTRACELLULAR DOMAIN OF NOTCH

In addition to a large extracellular domain that contains EGF-like
repeats, Notch has a sizeable intracellular (IC) domain of ~940 amino acids.
The IC domain includes a phosphorylation site (Kidd et al., 1989, Genes Dev. 3,
1113-1129), a putative nucleotide binding domain, a polyglutamine stretch
(Wharton et al., 1985, Cell 43, 567-581; Kidd et al., 1986, Mol. Cell. Biol. 6,
3094-3108), and sequences homologous to the yeast cdc10 gene, which is
involved in cell cycle control in yeast (Breeden and Nasmyth, 1987, Nature 329,
651-654). A variant Notch construct was used from which coding sequences for
~835 amino acids of the IC domain, including all of the structural features noted
above, had been deleted (leaving 25 membrane-proximal amino acids and a novel
59 amino acid carboxyl terminus; see Experimental Procedures).

In aggregation assays, cells that expressed the ECN1 construct
consistently formed aggregates with Delta+ cells, but not with themselves, just as
was observed for cells that expressed intact Notch. Sharp bands of ECN1
staining were observed within regions of contact with Delta+ cells, again
indicating a localization of ECN1 within regions of contact between cells. To test
for interactions within the membrane, surface antigen co-patching experiments
were conducted using cells cotransfected with the ECN1 and Delta constructs. As
observed for intact Notch, when ECN1 was patched using polyclonal antisera
against the extracellular domain of Notch, ECN1 and Delta colocalized at the cell
surface. These results demonstrate that the observed interactions between Notch
and Delta within the membrane do not require the deleted portion of the IC
domain of Notch and are therefore probably mediated by the extracellular domain.

6.2.6. NOTCH AND DELTA FORM DETERGENT-SOLUBLE
INTERMOLECULAR COMPLEXES

The preceding results indicated molecular interactions between
Notch and Delta present within the same membrane and between these proteins
expressed on different cells. A further test for such interactions is whether these
proteins would coprecipitate from nondenaturing detergent extracts of cells that
express Notch and Delta. If Notch and Delta form a stable intermolecular
complex either between or within cells, then it should be possible to precipitate
both proteins from cell extracts using specific antisera directed against one of
these proteins. This analysis was performed by immunoprecipitating Delta with
polyclonal antisera from NP-40/deoxycholate lysates (see Experimental
Procedures) of cells cotransfected with the Notch and Delta constructs that had
been allowed to aggregate overnight or of 0-24 hr wild-type embryos.

Coprecipitation of Notch was detected in Delta immunoprecipitates 
from cotransfected cells and embryos. However, coprecipitating Notch appeared 
to be present in much smaller quantities than Delta and was therefore difficult to 
detect. The fact that immunoprecipitation of Delta results in the coprecipitation 
of Notch constitutes direct evidence that these two proteins form stable 
intermolecular complexes in transfected S2 cells and in embryonic cells. 

6.3. DISCUSSION 
Use of an in vitro aggregation assay that employs normally
nonadhesive S2 cells showed that cells that express Notch and Delta adhere
specifically to one another.

7. EGF REPEATS 11 AND 12 OF NOTCH ARE
REQUIRED AND SUFFICIENT FOR
NOTCH-DELTA MEDIATED AGGREGATION

The same aggregation assay was used as described in Section 6,
together with deletion mutants of Notch, to identify regions within the extracellular
domain of Notch necessary for interactions with Delta. The evidence shows that
the EGF repeats of Notch are directly involved in this interaction and that only
two (ELR 11 and 12) of the 36 EGF repeats appear necessary. These two EGF
repeats are sufficient for binding to Delta, and the calcium dependence of
Notch-Delta mediated aggregation also associates with these two repeats. Finally,
the two corresponding EGF repeats from the Xenopus homolog of Notch also
mediate aggregation with Delta, implying that not only the structure of Notch
but also its function has been evolutionarily conserved. These results suggest that
the extracellular domain of Notch is surprisingly modular, and could potentially
bind a variety of proteins in addition to Delta. (See Rebay et al., 1991, Cell
67:687-699.)



7.1. EXPERIMENTAL PROCEDURES

7.1.1. EXPRESSION CONSTRUCTS
The constructs described are all derivatives of the full length Notch
expression construct #1 pMtNMg (see Section 6, supra), and were made as
described (Rebay et al., 1991, Cell 67:687-699).

7.1.2. CELL CULTURE AND TRANSFECTION
The Drosophila S2 cell line was grown and transfected as
described in Section 6, supra. The Delta-expressing stably transformed S2 cell
line L-49-6-7 (kindly established by L. Cherbas) was grown in M3 medium
(prepared by Hazleton Co.) supplemented with 11% heat inactivated fetal calf
serum (FCS) (Hyclone), 100 U/ml penicillin-100 μg/ml streptomycin-0.25 μg/ml
fungizone (Hazleton), 2 x 10^-7 M methotrexate, 0.1 mM hypoxanthine, and 0.016
mM thymidine.


7.1.3. AGGREGATION ASSAYS AND IMMUNOFLUORESCENCE
Aggregation assays and Ca++ dependence experiments were as
described supra, Section 6. Cells were stained with the anti-Notch monoclonal
antibody 9C6.C17 and anti-Delta rat polyclonal antisera (details described in
Section 6, supra). Surface expression of Notch constructs in unpermeabilized cells
was assayed using rat polyclonal antisera raised against the 0.8 kb (amino acids
237-501; Wharton et al., 1985, Cell 43, 567-581) BstYI fragment from the
extracellular domain of Notch. Cells were viewed under epifluorescence on a
Leitz Orthoplan 2 microscope.

7.2. RESULTS 

7.2.1. EGF REPEATS 11 AND 12 OF 
NOTCH ARE REQUIRED FOR 
NOTCH-DELTA MEDIATED AGGREGATION 

An extensive deletion analysis was undertaken of the extracellular
domain of the Notch protein, which was shown (supra, Section 6; Fehon et al.,
1990, Cell 61:523-534) to be involved in Notch-Delta interactions, to identify the
precise domain of Notch mediating these interactions. The ability of cells
transfected with the various deletion constructs to interact with Delta was tested
using the aggregation assay described in Section 6. Briefly, Notch deletion
constructs were transiently transfected into Drosophila S2 cells, induced with
CuSO4, and then aggregated overnight at room temperature with a small amount
of cells from the stably transformed Delta expressing cell line L-49-6-7 (Cherbas),
yielding a population typically composed of ~1% Notch expressing cells and
~5% Delta expressing cells, with the remaining cells expressing neither protein.

Schematic drawings of the constructs tested and results of the 
aggregation experiments are shown in Figure 2. To assay the degree of 
aggregation, cells were stained with antisera specific to each gene product and 
examined with immunofluorescent microscopy. Aggregates were defined as 
clusters of four or more cells containing both Notch and Delta expressing cells, 
and the values shown in Figure 2 represent the percentage of all Notch expressing 
cells found in such clusters. All numbers reflect the average result from at least 
two separate transfection experiments in which at least 100 Notch expressing cell 
units (either single cells or clusters) were scored. 
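
The scoring rule described above can be written out as a short calculation. The following is a minimal sketch under stated assumptions, not the procedure actually used: the `Unit` record, the function names, and the convention that cluster size is the number of stained (Notch or Delta expressing) cells are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One scored unit: a single cell or a cluster (hypothetical record)."""
    notch: int  # Notch expressing cells counted in the unit
    delta: int  # Delta expressing cells counted in the unit

    @property
    def size(self) -> int:
        # Assumption: only stained cells are counted toward cluster size.
        return self.notch + self.delta


def is_aggregate(unit, min_cells=4):
    """A cluster of four or more cells containing both cell types."""
    return unit.size >= min_cells and unit.notch > 0 and unit.delta > 0


def percent_notch_in_aggregates(units):
    """Percentage of all Notch expressing cells that lie in aggregates."""
    total = sum(u.notch for u in units)
    if total == 0:
        raise ValueError("no Notch expressing cells were scored")
    clustered = sum(u.notch for u in units if is_aggregate(u))
    return 100.0 * clustered / total


def average_score(experiments, min_experiments=2, min_units=100):
    """Average over separate transfections, enforcing the stated minimums."""
    if len(experiments) < min_experiments:
        raise ValueError("at least two separate transfections are required")
    for units in experiments:
        if sum(1 for u in units if u.notch > 0) < min_units:
            raise ValueError("fewer than 100 Notch expressing units scored")
    scores = [percent_notch_in_aggregates(units) for units in experiments]
    return sum(scores) / len(scores)
```

Under this sketch, a cluster of four Notch expressing cells with no Delta expressing cell is not scored as an aggregate, which mirrors the requirement that both cell types be present.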

The initial constructs (#2 ΔSph and #3 ΔCla) deleted large
portions of the EGF repeats. Their inability to promote Notch-Delta aggregation
suggested that the EGF repeats of Notch were involved in the interaction with
Delta. A series of six in-frame ClaI restriction sites was used to further dissect
the region between EGF repeats 7 and 30. Due to sequence homology between
repeats, five of the ClaI sites occur in the same relative place within the EGF
repeat, just after the third cysteine, while the sixth site occurs just before the first
cysteine of EGF repeat 31 (Figure 3). Thus, by performing a partial ClaI
digestion and then religating, deletions were obtained that not only preserved the
open reading frame of the Notch protein but in addition frequently maintained the
structural integrity and conserved spacing, at least theoretically, of the three
disulfide bonds in the chimeric EGF repeats produced by the religation (Figure 2,
constructs #4-14). Unfortunately, the most 3' ClaI site was resistant to digestion
while the next most 3' ClaI site broke between EGF repeats 30 and 31.
Therefore, when various ClaI digestion fragments were reinserted into the
framework of the complete ClaI digest (construct #3 ΔCla), the overall structure
of the EGF repeats was apparently interrupted at the 3' junction.

Several points about this series of constructs are worth noting.
First, removal of the ClaI restriction fragment breaking in EGF repeats 9 and 17
(construct #8 ΔEGF9-17) abolished aggregation with Delta, while reinsertion of
this piece into construct #3 ΔCla, which lacks EGF repeats 7-30, restored
aggregation to roughly wild type levels (construct #13 ΔCla+EGF9-17),
suggesting that EGF repeats 9 through 17 contain sequences important for binding
Delta. Second, all constructs in this series (#4-14) were consistent with the
binding site mapping to EGF repeats 9 through 17. Expression constructs
containing these repeats (#6, 7, 9, 10, 13) promoted Notch-Delta interactions
while constructs lacking these repeats (#4, 5, 8, 11, 12, 14) did not. To confirm
that inability to aggregate with Delta cells was not simply due to failure of the
mutagenized Notch protein to reach the cell surface, but actually reflected the
deletion of the necessary binding site, cell surface expression of all constructs was
tested by immunofluorescently staining live transfected cells with antibodies
specific to the extracellular domain of Notch. All constructs failing to mediate
Notch-Delta interactions produced a protein that appeared to be expressed
normally at the cell surface. Third, although the aggregation assay is not
quantitative, two constructs which contained EGF repeats 9-17, #9 ΔEGF17-26 or
most noticeably #10 ΔEGF26-30, aggregated at a seemingly lower level. Cells
transfected with constructs #9 ΔEGF17-26 and #10 ΔEGF26-30 showed
considerably less surface staining than normal, although fixed and permeabilized
cells reacted with the same antibody stained normally, indicating the epitopes
recognized by the antisera had not been simply deleted. By comparing the
percentage of transfected cells in either permeabilized or live cell populations, it
was found that roughly 50% of transfected cells for construct #9 ΔEGF17-26 and
10% for construct #10 ΔEGF26-30 produced detectable protein at the cell
surface. Thus these two constructs produced proteins which often failed to reach
the cell surface, perhaps because of misfolding, thereby reducing, but not
abolishing, the ability of transfected cells to aggregate with Delta-expressing cells.

Having mapped the binding site to EGF repeats 9 through 17,
further experiments (Rebay et al., 1991, Cell 67:687-699) revealed that EGF
repeat 14 of Notch was not involved in the interactions with Delta modelled by
the tissue culture assay.

To further map the Delta binding domain within EGF repeats 9-17,
specific oligonucleotide primers and the PCR technique were used to generate
several subfragments of this region. Three overlapping constructs, #16, 17 and
18 were produced, only one of which, #16 ΔCla+EGF9-13, when transfected
into S2 cells, allowed aggregation with Delta cells. Construct #19
ΔCla+EGF(10-13), which lacks EGF repeat 9, further defined EGF repeats 10-13
as the region necessary for Notch-Delta interactions.

Constructs #20-24 represented attempts to break this domain down
even further using the same PCR strategy (see Figure 3). Constructs #20
ΔCla+EGF(11-13), in which EGF repeat 12 is the only entire repeat added, and
#21 ΔCla+EGF(10-12), in which EGF repeat 11 is the only entire repeat added,
failed to mediate aggregation, suggesting that the presence of either EGF repeat
11 or 12 alone was not sufficient for Notch-Delta interactions. However, since
the 3' ligation juncture of these constructs interrupted the overall structure of the
EGF repeats, it was possible that a short "buffer" zone was needed to allow the
crucial repeat to function normally. Thus for example in construct #19
ΔCla+EGF(10-13), EGF repeat 12 might not be directly involved in binding to
Delta but instead might contribute the minimum amount of buffer sequence
needed to protect the structure of EGF repeat 11, thereby allowing interactions
with Delta. Constructs #22-24 addressed this issue. Constructs #22
ΔCla+EGF(10-11), which did not mediate aggregation, and #23 ΔCla+EGF(10-12),
which did, again suggested that both repeats 11 and 12 are required while the
flanking sequence from repeat 13 clearly is not. Finally, construct #24
ΔCla+EGF(11-12), although now potentially structurally disrupted at the 5'
junction, convincingly demonstrated that the sequences from EGF repeat 10 are
not crucial. Thus based on entirely consistent data from 24 constructs, EGF
repeats 11 and 12 of Notch together define the smallest functional unit obtainable
from this analysis that contains the necessary sites for binding to Delta in
transfected S2 cells.
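
The reasoning used to narrow the binding site can be illustrated with a small set calculation: the candidate region is the set of repeats common to every construct that aggregated, and the result is consistent if no construct that failed to aggregate contains that whole set. The following is only a sketch. The sets of intact EGF repeats assigned to each construct below are an illustrative reading of the descriptions above (partial repeats at the junctions are ignored), not data taken from the figures, and the function names are hypothetical.

```python
# construct -> (intact EGF repeats assumed present in the insert, aggregated?)
# The repeat sets are an assumption made for illustration only.
CONSTRUCTS = {
    "#16": ({10, 11, 12}, True),
    "#19": ({11, 12}, True),
    "#20": ({12}, False),
    "#21": ({11}, False),
    "#22": ({10, 11}, False),
    "#23": ({10, 11, 12}, True),
    "#24": ({11, 12}, True),
}


def candidate_region(constructs):
    """Repeats shared by every construct that mediated aggregation."""
    positives = [repeats for repeats, aggregated in constructs.values()
                 if aggregated]
    if not positives:
        raise ValueError("no aggregating construct to map from")
    return set.intersection(*positives)


def is_consistent(constructs, region):
    """True if a construct aggregates exactly when it contains the region."""
    return all(aggregated == region.issubset(repeats)
               for repeats, aggregated in constructs.values())


if __name__ == "__main__":
    region = candidate_region(CONSTRUCTS)
    print(sorted(region), is_consistent(CONSTRUCTS, region))
```

With the assumed repeat sets, the shared region is repeats 11 and 12, and each non-aggregating construct lacks at least one of them, which matches the conclusion drawn in the text.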

7.2.2. EGF REPEATS 11 AND 12 OF NOTCH
ARE SUFFICIENT FOR NOTCH-DELTA
MEDIATED AGGREGATION

The large ClaI deletion into which PCR fragments were inserted
(#3 ΔCla) retains roughly 1/3 of the original 36 EGF repeats as well as the three
Notch/lin-12 repeats. While these are clearly not sufficient to promote
aggregation, it is possible that they form a necessary framework within which
specific EGF repeats can interact with Delta. To test whether only a few EGF
repeats were in fact sufficient to promote aggregation, two constructs were
designed, #25 ΔEGF which deleted all 36 EGF repeats except for the first
two-thirds of repeat 1, and #30 ΔECN which deleted the entire extracellular
portion of Notch except for the first third of EGF repeat 1 and ~35 amino acids
just before the transmembrane domain. Fragments which had mediated Notch-Delta
aggregation in the background of construct #3 ΔCla, when inserted into construct
#25 ΔEGF, were again able to promote interactions with Delta (constructs #26-30).
Analogous constructs (#31, 32) in which the Notch/lin-12 repeats were also
absent, again successfully mediated Notch-Delta aggregation. Thus EGF repeats
11 and 12 appear to function as independent modular units which are sufficient to
mediate Notch-Delta interactions in S2 cells, even in the absence of most of the
extracellular domain of Notch.

7.2.3. EGF REPEATS 11 AND 12 OF NOTCH
MAINTAIN THE CALCIUM DEPENDENCE OF
NOTCH-DELTA MEDIATED AGGREGATION

The ability of cells expressing certain deletion constructs to
aggregate with Delta expressing cells was examined in the presence or absence of
Ca++ ions. The calcium dependence of the interaction was preserved in even the
smallest construct, consistent with the notion that the minimal constructs
containing EGF repeats 11 and 12 bind to Delta in a manner similar to that of full
length Notch.

7.2.4. THE DELTA BINDING FUNCTION OF EGF
REPEATS 11 AND 12 OF NOTCH IS
CONSERVED IN THE XENOPUS
HOMOLOG OF NOTCH

PCR primers based on the Xenopus Notch sequence (Coffman et
al., 1990, Science 249, 1438-1441) were used to obtain an ~350 bp fragment
from a Xenopus Stage 17 cDNA library that includes EGF repeats 11 and 12
flanked by half of repeats 10 and 13 on either side. This fragment was cloned
into construct #3 ΔCla, and three independent clones were tested for ability to
interact with Delta in the cell culture aggregation assay. Two of the clones,
#33a&b ΔCla+XEGF(10-13), when transfected into S2 cells were able to mediate
Notch-Delta interactions at a level roughly equivalent to the analogous Drosophila
Notch construct #19 ΔCla+EGF(10-13), and again in a calcium dependent manner
(Table III). However, the third clone #33c ΔCla+XEGF(10-13) failed to mediate
Notch-Delta interactions although the protein was expressed normally at the cell
surface as judged by staining live unpermeabilized cells. Sequence comparison of
the Xenopus PCR product in constructs #33a and 33c revealed a missense
mutation resulting in a leucine to proline change (amino acid #453, Coffman et
al., 1990, Science 249, 1438-1441) in EGF repeat 11 of construct #33c.
Although this residue is not conserved between Drosophila and Xenopus Notch
(Figure 8), the introduction of a proline residue might easily disrupt the structure
of the EGF repeat, and thus prevent it from interacting properly with Delta.

Comparison of the amino acid sequence of EGF repeats 11 and 12
of Drosophila and Xenopus Notch reveals a high degree of amino acid identity,
including the calcium binding consensus sequence (Figure 4, SEQ ID NO:1 and
NO:2). However, the level of homology is not strikingly different from that
shared between most of the other EGF repeats, which overall exhibit about 50%
identity at the amino acid level. This one-to-one correspondence between the
individual EGF repeats of Drosophila and Xenopus Notch, together with the
functional conservation of ELR 11 and 12, suggests that the 36 EGF repeats of
Notch comprise a tandem array of conserved functional units.

7.3. DISCUSSION 
An extensive deletion analysis of the extracellular domain of Notch
was used to show that the regions of Notch containing EGF-homologous repeats
11 and 12 are both necessary and sufficient for Notch-Delta-mediated
aggregation, and that this Delta binding capability has been conserved in the same
two EGF repeats of Xenopus Notch. The finding that the aggregation mapped to
EGF repeats 11 and 12 of Notch demonstrates that the EGF repeats of Notch also
function as specific protein binding domains. EGF repeats 11 and 12 alone
(#32ΔECN+EGF(11-12)) were sufficient to maintain the Ca++ dependence of
Notch-Delta interactions.

The various deletion constructs suggest that ELR 11 and ELR 12
function as a modular unit, independent of the immediate context into which they
are placed. Thus, neither the remaining 34 EGF repeats nor the three Notch/lin-12
repeats appear necessary to establish a structural framework required for EGF
repeats 11 and 12 to function. Interestingly, almost the opposite effect was
observed: although the aggregation assay does not measure the strength of the
interaction, as the binding site was narrowed down to smaller and smaller
fragments, an increase was observed in the ability of the transfected cells to
aggregate with Delta-expressing cells, suggesting that the normal flanking EGF
sequences actually impede association between the proteins. The remaining 34
EGF repeats may also form modular binding domains for other proteins
interacting with Notch at various times during development.

The finding that EGF repeats 11 and 12 of Notch form a discrete
Delta binding unit represents the first concrete evidence supporting the idea that
each EGF repeat or small subset of repeats may play a unique role during
development, possibly through direct interactions with other proteins. The
homologies seen between the adhesive domain of Delta and Serrate (Figure 5)
suggest that the homologous portion of Serrate is "adhesive" in that it mediates
binding to other toporythmic proteins (see Section 8, infra). In addition, the gene
scabrous, which encodes a secreted protein with similarity to fibrinogen, may
interact with Notch.

In addition to the EGF repeat, multiple copies of other structural
motifs commonly occur in a variety of proteins. One relevant example is the
cdc10/ankyrin motif, six copies of which are found in the intracellular domain of
Notch. Ankyrin contains 22 of these repeats. Perhaps repeated arrays of
structural motifs may in general represent a linear assembly of a series of
modular protein binding units. Given these results together with the known
structural, genetic and developmental complexity of Notch, Notch may interact
with a number of different ligands in a precisely regulated temporal and spatial
pattern throughout development. Such context-specific interactions with
extracellular proteins could be mediated by the EGF and Notch/lin-12 repeats,
while interactions with cytoskeletal and cytoplasmic proteins could be mediated by
the intracellular cdc10/ankyrin motifs.






8. SEQUENCES WHICH MEDIATE
NOTCH-SERRATE INTERACTIONS

As described herein, the two EGF repeats of Notch which mediate
interactions with Delta, namely EGF repeats 11 and 12, also constitute a Serrate
binding domain (see Rebay et al., 1991, Cell 67:687-699).

To test whether Notch and Serrate directly interact, S2 cells were
transfected with a Serrate expression construct and mixed with Notch-expressing
cells in an aggregation assay. For the Serrate expression construct, a synthetic
primer containing an artificial BamHI site immediately 5' to the initiator AUG at
position 442 (all sequence numbers are according to Fleming et al., 1990, Genes
& Dev. 4:2188-2201) and homologous through position 464, was used in
conjunction with a second primer from position 681-698 to generate a DNA
fragment of ~260 base pairs. This fragment was cut with BamHI and KpnI
(position 571) and ligated into Bluescript KS+ (Stratagene). This construct,
BTSer5'PCR, was checked by sequencing, then cut with KpnI. The Serrate KpnI
fragment (571 - 2981) was inserted and the proper orientation selected, to
generate BTSer5'PCR-Kpn. The 5' SacII fragment of BTSer5'PCR-Kpn (SacII
sites in Bluescript polylinker and in Serrate (1199)) was isolated and used to
replace the 5' SacII fragment of cDNA C1 (Fleming et al., 1990, Genes & Dev.
4:2188-2201), thus regenerating the full length Serrate cDNA minus the 5'
untranslated regions. This insert was isolated by a SalI and partial BamHI
digestion and shuttled into the BamHI and SalI sites of pRmHa-3 to generate the
final expression construct, Ser-mtn.

Serrate-expressing cells adhered to Notch-expressing cells in a
calcium dependent manner (Figure 2 and Rebay et al., 1991, supra). However,
unlike Delta, under the experimental conditions tested, Serrate did not appear to
interact homotypically. In addition, no interactions were detected between Serrate
and Delta.

A subset of Notch deletion constructs were tested, and showed that
EGF repeats 11 and 12, in addition to binding to Delta, also mediate interactions
with Serrate (Figure 2; Constructs #1, 7-10, 13, 16, 17, 19, 28, and 32). In
addition, the Serrate-binding function of these repeats also appears to have been
conserved in the corresponding two EGF repeats of Xenopus Notch
(#33ΔCla+XEGF(10-13)). These results unambiguously show that Notch
interacts with both Delta and Serrate, and that the same two EGF repeats of
Notch mediate both interactions. The Serrate region which is essential for the
Notch/Serrate aggregation was also defined. Deleting nucleotides 676-1287 (i.e.,
amino acids 79-282) (see Figure 5; SEQ ID NO:3 and NO:4) eliminates the
ability of the Serrate protein to aggregate with Notch.
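
The correspondence between the deleted nucleotides (676-1287) and the stated amino acids (79-282) can be checked arithmetically from the position of the initiator AUG (442 in the Fleming et al. numbering used above). The following is a minimal Python sketch of that check; the function name `codon_index` and the constant name are illustrative only and do not come from the source.

```python
def codon_index(nt_pos: int, start_pos: int) -> tuple[int, int]:
    """Map a nucleotide position to (1-based codon number, 0-based offset in codon).

    start_pos is the position of the first base of the initiator codon.
    """
    if nt_pos < start_pos:
        raise ValueError("nucleotide lies upstream of the initiator codon")
    offset = nt_pos - start_pos
    return offset // 3 + 1, offset % 3


# Position of the A of the initiator AUG, as stated in the text.
SERRATE_AUG = 442

first_deleted = codon_index(676, SERRATE_AUG)   # first base of the deletion
last_deleted = codon_index(1287, SERRATE_AUG)   # last base of the deletion

print(first_deleted, last_deleted)
```

Under these assumptions the deletion begins at the first base of codon 79 and ends at the third base of codon 282, so it removes whole codons and agrees with the amino acid range given in the text.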

Notch and Serrate appear to aggregate less efficiently than Notch
and Delta, perhaps because the Notch-Serrate interaction is weaker. One trivial
explanation for this reduced amount of aggregation could be that the Serrate
construct simply did not express as much protein at the cell surface as the Delta
construct, thereby diminishing the strength of the interaction. Alternatively, the
difference in strength of interaction may indicate a fundamental functional
difference between Notch-Delta and Notch-Serrate interactions that may be
significant in vivo.

9. THE CLONING, SEQUENCING, AND
EXPRESSION OF HUMAN NOTCH

9.1. ISOLATION AND SEQUENCING OF HUMAN NOTCH

Clones for the human Notch sequence were originally obtained
using the polymerase chain reaction (PCR) to amplify DNA from a 17-18 week
human fetal brain cDNA library in the Lambda Zap II vector (Stratagene).

The 400 bp fragment obtained in this manner was then used as a
probe with which to screen the same library for human Notch clones. The
original screen yielded three unique clones, hN3k, hN2k, and hN5k, all of which
were shown by subsequent sequence analysis to fall in the 3' end of human Notch
(Figure 6). A second screen using the 5' end of hN3k as probe was undertaken
to search for clones encompassing the 5' end of human Notch. One unique clone,
hN4k, was obtained from this screen, and preliminary sequencing data indicate
that it contains most of the 5' end of the gene (Figure 6). Together, clones
hN4k, hN3k and hN5k encompass about 10 kb of the human Notch homolog(s),
beginning early in the EGF-repeats and extending into the 3' untranslated region
of the gene. All three clones are cDNA inserts in the EcoRI site of pBluescript
SK- (Stratagene). The host E. coli strain is XL1-Blue (see Maniatis, T., 1990,
Molecular Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York, p. A12). An alignment of the
human Notch sequences with Drosophila Notch is shown in Figure 7.

The sequence of various portions of Notch contained in the cDNA
clones was determined (by use of Sequenase®, U.S. Biochemical Corp.) and is
shown for hN2k and hN4k in Figures 8 (SEQ ID NO:5-7) and 9 (SEQ ID NO:8,
9), respectively. Further sequence analysis of hN2k revealed that it encodes a
human Notch sequence overlapping that contained in hN5k.

The complete nucleotide sequences of the human Notch cDNA
contained in hN3k and hN5k were determined by the dideoxy chain termination
method using the Sequenase® kit (U.S. Biochemical Corp.). Those nucleotide
sequences encoding human Notch, in the appropriate reading frame, were readily
identified since there are no introns and translation in only one out of the three
possible reading frames yields a sequence which, upon comparison with the
published Drosophila Notch deduced amino acid sequence, shows
a substantial degree of homology to the Drosophila Notch sequence. The DNA
and deduced protein sequences of the human Notch cDNA in hN3k and hN5k are
presented in Figures 10 (SEQ ID NO:10, 11) and 11 (SEQ ID NO:12, 13),
respectively. Clone hN3k encodes a portion of a Notch polypeptide starting at
approximately the third Notch/lin-12 repeat to several amino acids short of the
carboxy-terminal amino acid. Clone hN5k encodes a portion of a Notch
polypeptide starting approximately before the cdc10 region through the end of the
polypeptide, and also contains a 3' untranslated region.

Comparing the DNA and protein sequences presented in Figure 10
(SEQ ID NO:10, 11) with those in Figure 11 (SEQ ID NO:12, 13) reveals
significant differences between the sequences, suggesting that hN3k and hN5k
represent part of two distinct Notch-homologous genes. The data thus suggest
that the human genome harbors more than one Notch-homologous gene. This is
unlike Drosophila, where Notch appears to be a single-copy gene.

Comparison of the DNA and amino acid sequences of the human
Notch homologs contained in hN3k and hN5k with the corresponding Drosophila
Notch sequences (as published in Wharton et al., 1985, Cell 43:567-581) and
with the corresponding Xenopus Notch sequences (as published in Coffman et al.,
1990, Science 249:1438-1441 or available from Genbank® (accession number
M33874)) also revealed differences.

The amino acid sequence shown in Figure 10 (hN3k) was
compared with the predicted sequence of the TAN-1 polypeptide shown in Figure
2 of Ellisen et al., August 1991, Cell 66:649-661. Some differences were found
between the deduced amino acid sequences; however, overall the hN3k Notch
polypeptide sequence is 99% identical to the corresponding TAN-1 region (TAN-1
amino acids 1455 to 2506). Four differences were noted: (1) in the region
between the third Notch/lin-12 repeat and the first cdc10 motif, there is an
arginine (hN3k) instead of an X (TAN-1 amino acid 1763); (2) there is a proline
(hN3k) instead of an X (TAN-1, amino acid 1787); (3) there is a conservative
change of an aspartic acid residue (hN3k) instead of a glutamic acid residue
(TAN-1, amino acid 2495); and (4) the carboxyl-terminal region differs
substantially between TAN-1 amino acids 2507 and 2535.

The amino acid sequence shown in Figure 11 (hN5k) was
compared with the predicted sequence of the TAN-1 polypeptide shown in Figure
2 of Ellisen et al., August 1991, Cell 66:649-661. Differences were found
between the deduced amino acid sequences. The deduced Notch polypeptide of
hN5k is 79% identical to the TAN-1 polypeptide (64% identical to Drosophila
Notch) in the cdc10 region that encompasses both the cdc10 motifs (TAN-1 amino
acids 1860 to 2217) and the well-conserved flanking regions (Fig. 12). The
cdc10 region covers amino acids 1860 through 2217 of the TAN-1 sequence. In
addition, the hN5k encoded polypeptide is 65% identical to the TAN-1
polypeptide (44% identical to Drosophila Notch) at the carboxy-terminal end of
the molecule containing a PEST (proline, glutamic acid, serine, threonine)-rich
region (TAN-1 amino acids 2482 to 2551) (Fig. 12B). The stretch of 215 amino
acids lying between the aforementioned regions is not well conserved among any
of the Notch-homologous clones represented by hN3k, hN5k, and TAN-1.
Neither the hN5k polypeptide nor Drosophila Notch shows significant levels of
amino acid identity to the other proteins in this region (e.g., hN5k/TAN-1 =
24% identity; hN5k/Drosophila Notch = 11% identity; TAN-1/Drosophila Notch
= 17% identity). In contrast, Xenopus Notch (Xotch) (SEQ ID NO:16), rat
Notch (SEQ ID NO:17), and TAN-1 (SEQ ID NO:18) continue to share
significant levels of sequence identity with one another (e.g., TAN-1/rat Notch =
75% identity, TAN-1/Xenopus Notch = 45% identity, rat Notch/Xenopus Notch
= 50% identity).
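
The pairwise identity figures quoted above are percentages of matching residues between aligned sequences. The source does not state how the alignments were produced or how gaps were scored, so the following Python sketch is only one plausible convention: it assumes two already-aligned strings of equal length, uses `-` for gaps, and ignores any column containing a gap. The function name and the short peptide strings are invented for illustration and are not taken from the Notch sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # assumed convention: gap columns are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Invented toy alignment: 9 gap-free columns, 8 of them identical.
toy_a = "CDPNPC-NGG"
toy_b = "CSPNPCINGG"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence or counting gaps as mismatches) give different numbers, so values computed this way are comparable only when the same convention is used throughout.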

Examination of the sequence of the intracellular domains of the
vertebrate Notch homologs shown in Figure 12B revealed an unexpected finding:
all of these proteins, including hN5k, contain a putative CcN motif, associated
with nuclear targeting function, in the conserved region following the last of the
six cdc10 repeats (Fig. 12B). Although Drosophila Notch lacks such a defined
motif, closer inspection of its sequence revealed the presence of a possible
bipartite nuclear localization sequence (Robbins et al., 1991, Cell 64:615-623), as
well as of possible CK II and cdc2 phosphorylation sites, all in relative proximity
to one another, thus possibly defining an alternative type of CcN motif (Fig.
12B).

To isolate clones covering the 5' end of hN (the human Notch
homolog contained in part in hN5k), clone hN2k was used as a probe to screen
260,000 plaques of human fetal brain phage library, commercially available from
Stratagene, for crosshybridizing clones. Four clones were identified and isolated
using standard procedures (Maniatis et al., 1982, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New
York). Four clones were also isolated by hybridization to the Notch-homologous
sequence of Adams et al., 1992, Nature 355:632-655, which was obtained from
the ATCC.

To isolate clones covering the 5' end of TAN-1, the human fetal
brain library that is commercially available from Stratagene was screened for
clones which would extend the sequence to the 5' end. 880,000 plaques were
screened and four clones were identified which crosshybridized with the hN3k
sequences. Sequencing confirmed the relative position of these sequences within
the Notch protein encoded by TAN-1.

The 5' sequence of our isolated TAN-1 homolog has been
determined through nucleotide number 972 (nucleotide number 1 being the A in
the ATG initiation codon), and compared to the sequence as published by Ellisen
et al. (1991, Cell 66:649-661). At nucleotide 559, our TAN-1 homolog has a G,
whereas Ellisen et al. disclose an A, which change results in a different encoded
amino acid. Thus, within the first 324 amino acids, our TAN-1-encoded protein
differs from that taught by Ellisen et al., since our protein has a Gly at position
187, whereas Ellisen et al. disclose an Arg at that position (as presented in Figure
13).
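
The numbers in this comparison are internally consistent, which can be verified with a short calculation. The Python sketch below assumes only what the text states (nucleotide 1 is the A of the ATG, the difference is G versus A at nucleotide 559, and the residue is Gly versus Arg) together with the standard genetic code; the actual codon is not given in the source, so the sketch merely lists which Arg codons could become Gly codons through such a change. All function and variable names are illustrative.

```python
# Standard genetic code entries for the two residues involved (DNA alphabet).
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}


def codon_of(nt_pos: int) -> tuple[int, int]:
    """Return (1-based codon number, 0-based offset in codon); nt 1 = A of ATG."""
    if nt_pos < 1:
        raise ValueError("position must be >= 1")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3


def substitute(codon: str, index: int, base: str) -> str:
    """Return the codon with the base at the given offset replaced."""
    return codon[:index] + base + codon[index + 1:]


codon_number, index = codon_of(559)

# Arg codons that carry an A at the affected offset and become Gly when it is G.
arg_to_gly = sorted(
    c for c in ARG_CODONS
    if c[index] == "A" and substitute(c, index, "G") in GLY_CODONS
)

residues_covered = 972 // 3  # amino acids spanned by nucleotides 1-972

print(codon_number, index, arg_to_gly, residues_covered)
```

Nucleotide 559 falls on the first base of codon 187, only the AGA and AGG arginine codons yield a glycine codon when that base is changed to G, and 972 nucleotides correspond to 324 codons, all in agreement with the statements above.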

The full-length amino acid sequences of both the hN
(SEQ ID NO:19) and TAN-1-encoded (SEQ ID NO:20) proteins, as well as
Xenopus and Drosophila Notch proteins, are shown in Figure 13. The full-length
DNA coding sequence (except for that encoding the initiator Met) (contained in
SEQ ID NO:21) and encoded amino acid sequence (except that the initiator Met is
not shown) (contained in SEQ ID NO:19) of hN are shown in Figure 17.

9.2. EXPRESSION OF HUMAN NOTCH 
Expression constructs were made using the human Notch cDNA
clones discussed in Section 9.1 above. In the cases of hN3k and hN2k, the entire
clone was excised from its vector as an EcoRI restriction fragment and subcloned
into the EcoRI restriction site of each of the three pGEX vectors (Glutathione S-
Transferase expression vectors; Smith and Johnson, 1988, Gene 7, 31-40). This
allows for the expression of the Notch protein product from the subclone in the
correct reading frame. In the case of hN5k, the clone contains two internal
EcoRI restriction sites, producing 2.6, 1.5 and 0.6 kb fragments. Both the 2.6
and the 1.5 kb fragments have also been subcloned into each of the pGEX
vectors.

The pGEX vector system was used to obtain expression of human
Notch fusion (chimeric) proteins from the constructs described below. The
cloned Notch DNA in each case was inserted, in phase, into the appropriate
pGEX vector. Each construct was then electroporated into bacteria (E. coli), and
was expressed as a fusion protein containing the Notch protein sequences fused to
the carboxyl terminus of glutathione S-transferase protein. Expression of the
fusion proteins was confirmed by analysis of bacterial protein extracts by
polyacrylamide gel electrophoresis, comparing protein extracts obtained from
bacteria containing the pGEX plasmids with and without the inserted Notch DNA.
The fusion proteins were soluble in aqueous solution, and were purified from
bacterial lysates by affinity chromatography using glutathione-coated agarose
(since the carboxyl terminus of glutathione S-transferase binds to glutathione).
The expressed fusion proteins were bound by an antibody to Drosophila Notch, as
assayed by Western blotting.

The constructs used to make human Notch-glutathione S-
transferase fusion proteins were as follows:

hNFP#2 - PCR was used to obtain a fragment starting just before
the cdc10 repeats at nucleotide 192 of the hN5k insert to just before the
PEST-rich region at nucleotide 1694. The DNA was then digested with
BamHI and SmaI and the resulting fragment was ligated into pGEX-3.
After expression, the fusion protein was purified by binding to glutathione
agarose. The purified polypeptide was quantitated on a 4-15% gradient
polyacrylamide gel. The resulting fusion protein had an approximate
molecular weight of 83 kD.

hN3FP#1 - The entire hN3k DNA insert (nucleotide 1 to 3235)
was excised from the Bluescript (SK) vector by digesting with EcoRI.
The DNA was ligated into pGEX-3.

hN3FP#2 - A 3' segment of hN3k DNA (nucleotide 1847 to 3235)
plus some of the polylinker was cut out of the Bluescript (SK) vector by
digesting with XmaI. The fragment was ligated into pGEX-1.

Following purification, these fusion proteins are used to make
polyclonal and/or monoclonal antibodies to human Notch.

10. NOTCH EXPRESSION IN NORMAL
AND MALIGNANT CELLS

Various human patient tissue samples and cell lines, representing
both normal and a wide variety of malignant cells, are assayed to detect and/or
quantitate expression of Notch. Patient tissue samples are obtained from the
pathology department at the Yale University School of Medicine.

The following assays are used to measure Notch expression in
patient tissue samples: (a) Northern hybridization; (b) Western blots; (c) in situ
hybridization; and (d) immunocytochemistry. Assays are carried out using
standard techniques. Northern hybridization and in situ hybridization are carried
out (i) using a DNA probe specific to the Notch sequence of clone hN3k; and (ii)
using a DNA probe specific to the Notch sequence of clone hN5k. Western blots
and immunocytochemistry are carried out using an antibody to Drosophila Notch
protein (which also recognizes human Notch proteins).

Northern hybridization and Western blots, as described above, are 
also used to analyze numerous human cell lines, representing various normal or 
cancerous tissues. The cell lines tested are listed in Table 2. 



Table 2

HUMAN CELL LINES

Tissue/Tumor        Cell line

Bone marrow         IM-9
                    KG-1

Brain               A-172
                    HS 683
                    U-87MG
                    TE671

Breast              BT-20
                    Hs 578Bs
                    MDA-MB-330

Colon               Caco-2
                    SW 48
                    T84
                    WiDr

Embryo              FHs 173We

Kidney              A-498
                    A-704
                    Caki-2

Leukemia            ARH-77
                    KG-1

Liver               Hep G2
                    WRL 68

Lung                Calu-1
                    HLF-a
                    SK-Lu-1

Lymphoblasts        CCRF-CEM
                    HuT78

Lymphoma            Hs 445
                    MS116
                    U-937

Melanoma            A-375
                    G-361
                    Hs 294T
                    SK-MEL-1

Myeloma             IM-9
                    RPMI 8226

Neuroblastoma       IMR-32
                    SK-N-SH
                    SK-N-MC

Ovary               Caov-3
                    Caov-4
                    PA-1

Plasma Cells        ARH-77

Sarcoma             A-204
                    A673
                    HOS

Skin                Amdur II
                    BUD-8

Testis              Tera-1
                    Tera-2

Thymus              Hs67

Uterus              AN3 Ca
                    HEC-1-A



Malignancies of malignant cell tissue types which are thus shown 
to specifically express Notch can be treated as described in Section 5.1 et seq. 

10.1. EXPRESSION OF HUMAN NOTCH PROTEIN
IS INCREASED IN VARIOUS MALIGNANCIES

As described below, we have found that human Notch protein
expression is increased in at least three human cancers, namely cervical, breast,
and colon cancer. Immunocytochemical staining of tissue samples from cervical,
breast, and colon cancers of human patients showed clearly that the malignant
tissue expresses high levels of Notch, at increased levels relative to non-malignant
tissue sections. This broad spectrum of different neoplasias in which there is
elevated Notch expression suggests that many more cancerous conditions will be
seen to upregulate Notch.

Slides of human tumor samples (for breast, colon, and cervical
tumors) were obtained from the tissue bank of the Pathology Department, Yale
Medical School. The stainings were done using monoclonal antibodies raised
against the P1 and P4 fusion proteins which were generated from sequences of hN
and TAN-1, respectively.

The P1 and P4 fusion proteins were obtained by insertion of the
desired human Notch sequence into the appropriate pGEX expression vector
(Smith and Johnson, 1988, Gene 7:31-40; AMRAD Corp., Melbourne, Australia)
and were affinity-purified according to the instructions of the manufacturer
(AMRAD Corp.). For production of the P1 fusion protein, pGEX-2 was cut with
BamHI and ligated to a concatamer which consists of three copies of a 518 bp
BamHI-BglII fragment of hN. Rats were immunized with the expressed protein
and monoclonal antibodies were produced by standard procedures. For
production of the P4 fusion protein, pGEX-2 was cut with BamHI and ligated to a
concatamer which consists of three copies of a 473 bp BamHI-BglII fragment of
TAN-1. Rats were immunized with the expressed protein, and monoclonal
antibodies were produced by standard procedures.

In all tumors examined, the Notch proteins encoded by both
human Notch homologs TAN-1 and hN were present at increased levels in the
malignant part of the tissue compared to the normal part. Representative
stainings are shown in the pictures provided (Figs. 14-16).

The staining procedure was as follows: The tissues were fixed in
paraformaldehyde, embedded in paraffin, cut in 5 micrometer thick sections and
placed on glass slides. Then the following steps were carried out:

1. Deparaffinization through 4 changes of xylene, 4 minutes each.

2. Removal of xylene through 3 changes in absolute ethanol, 4
minutes each.

3. Gradual rehydration of the tissues by immersing the slides into
95%, 90%, 80%, 60% and 30% ethanol, 4 minutes each. At the
end the slides were rinsed in distilled water for 5 minutes.

4. Quenching of endogenous peroxidase by incubating for 30
minutes in 0.3% hydrogen peroxide in methanol.

5. Washing in PBS (10 mM sodium phosphate pH 7.5, 0.9% NaCl)
for 20 minutes.

6. Incubation for 1 hour in blocking solution. (Blocking solution:
PBS containing 4% normal rabbit serum and 0.1% Triton X-100.)

11. DEPOSIT OF MICROORGANISMS
The following recombinant bacteria, each carrying a plasmid
encoding a portion of human Notch, were deposited on May 2, 1991 with the
American Type Culture Collection, 1201 Parklawn Drive, Rockville, Maryland
20852, under the provisions of the Budapest Treaty on the International
Recognition of the Deposit of Microorganisms for the Purposes of Patent
Procedures.

Bacteria carrying      Plasmid      ATCC Accession No.

E. coli XL1-Blue       hN4k         68610
E. coli XL1-Blue       hN3k         68609
E. coli XL1-Blue       hN5k         68611

The present invention is not to be limited in scope by the
microorganisms deposited or the specific embodiments described herein. Indeed,
various modifications of the invention in addition to those described herein will
become apparent to those skilled in the art from the foregoing description and
accompanying figures. Such modifications are intended to fall within the scope of
the appended claims.

Various publications are cited herein, the disclosures of which are
incorporated by reference in their entireties.

7. Incubation overnight at 4°C with primary antibody diluted in
blocking solution. Final concentration of primary antibody 20-50
µg/ml.

8. Washing for 20 minutes with PBS+0.1% Triton X-100 (3
changes).

9. Incubation for 30 minutes with biotinylated rabbit anti-rat
antibody: 50 µl of biotinylated antibody (VECTOR) in 10 ml of
blocking solution.

10. Washing for 20 minutes with PBS+0.1% Triton X-100 (3
changes).

11. Incubation with ABC reagent (VECTOR) for 30 minutes (the
reagent is made in PBS+0.1% Triton X-100).

12. Washing for 20 minutes in PBS+0.1% Triton X-100, followed
by incubation for 2 minutes in PBS+0.5% Triton X-100.

13. Incubation for 2-5 minutes in peroxidase substrate solution.
Peroxidase substrate solution: Equal volumes of 0.02% hydrogen
peroxide in distilled water and 0.1% diaminobenzidine
tetrahydrochloride (DAB) in 0.1 M Tris buffer pH 7.5 are mixed
just before the incubation with the tissues. Triton X-100 is added
to the final solution at a concentration of 0.5%.

14. Washing for 15 minutes in tap water.

15. Counterstaining for 10 minutes with Mayer's hematoxylin.

16. Washing for 15 minutes in tap water.

17. Dehydration through changes in 30%, 60%, 80%, 90%, 95% and
absolute ethanol (4 minutes each).

18. Immersion into xylene (2 changes, 4 minutes each).

19. Mounting, light microscopy.
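
The working concentrations implied by steps 9 and 13 follow from simple dilution arithmetic. The Python sketch below works them out under two assumptions that the source does not state: that the 50 µl of biotinylated antibody is added to (rather than made up to) 10 ml of blocking solution, and that the small volume of Triton X-100 added to the substrate solution is negligible. The function and variable names are illustrative only.

```python
def mixed_percent(stock_percent: float, stock_volume: float, total_volume: float) -> float:
    """Concentration (%) after diluting a stock into a given total volume."""
    if total_volume <= 0 or stock_volume < 0 or stock_volume > total_volume:
        raise ValueError("volumes must satisfy 0 <= stock_volume <= total_volume, total > 0")
    return stock_percent * stock_volume / total_volume


def fold_dilution(stock_volume: float, diluent_volume: float) -> float:
    """Fold dilution when stock_volume is added to diluent_volume (same units)."""
    if stock_volume <= 0 or diluent_volume < 0:
        raise ValueError("stock volume must be positive and diluent non-negative")
    return (stock_volume + diluent_volume) / stock_volume


# Step 13: equal volumes of 0.02% hydrogen peroxide and 0.1% DAB are mixed,
# so each component ends up at half its starting concentration.
h2o2_final = mixed_percent(0.02, 1.0, 2.0)
dab_final = mixed_percent(0.1, 1.0, 2.0)

# Step 9: 50 microliters of antibody added to 10 ml (10,000 microliters).
secondary_fold = fold_dilution(50.0, 10_000.0)

print(h2o2_final, dab_final, secondary_fold)
```

On these assumptions the substrate solution contains roughly 0.01% hydrogen peroxide and 0.05% DAB, and the secondary antibody is used at about a 1:200 dilution.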
(A) ADDRESSEE: Pennie & Edmonds

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2892 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 142..2640

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GAATTCGGAG GAATTATTCA AAACATAAAC ACAATAAACA ATTTGAGTAG TTGCCGCACA 60 
CACACACACA CACAGCCCGT GGATTATTAC ACTAAAAGCG ACACTCAATC CAAAAAATCA 120 



GCAACAAAAA CATCAATAAA C ATG CAT TGG ATT AAA TGT TTA TTA ACA GCA 171
Met His Trp Ile Lys Cys Leu Leu Thr Ala
1 5 10

TTC ATT TGC TTC ACA GTC ATC GTG CAG GTT CAC AGT TCC GGC AGC TTT 219
Phe Ile Cys Phe Thr Val Ile Val Gln Val His Ser Ser Gly Ser Phe
15 20 25

GAG TTG CGC CTG AAG TAC TTC AGC AAC GAT CAC GGG CGG GAC AAC GAG 267
Glu Leu Arg Leu Lys Tyr Phe Ser Asn Asp His Gly Arg Asp Asn Glu
30 35 40

GGT CGC TGC TGC AGC GGG GAG TCG GAC GGA GCG ACG GGC AAG TGC CTG 315
Gly Arg Cys Cys Ser Gly Glu Ser Asp Gly Ala Thr Gly Lys Cys Leu
45 50 55

GGC AGC TGC AAG ACG CGG TTT CGC GTC TGC CTA AAG CAC TAC CAG GCC 363
Gly Ser Cys Lys Thr Arg Phe Arg Val Cys Leu Lys His Tyr Gln Ala
60 65 70

ACC ATC GAC ACC ACC TCC CAG TGC ACC TAC GGG GAC GTG ATC ACG CCC 411
Thr Ile Asp Thr Thr Ser Gln Cys Thr Tyr Gly Asp Val Ile Thr Pro
75 80 85 90

ATT CTC GGC GAG AAC TCG GTC AAT CTG ACC GAC GCC CAG CGC TTC CAG 459
Ile Leu Gly Glu Asn Ser Val Asn Leu Thr Asp Ala Gln Arg Phe Gln
95 100 105

AAC AAG GGC TTC ACG AAT CCC ATC CAG TTC CCC TTC TCG TTC TCA TGG 507
Asn Lys Gly Phe Thr Asn Pro Ile Gln Phe Pro Phe Ser Phe Ser Trp
110 115 120

CCG GGT ACC TTC TCG CTG ATC GTC GAG GCC TGG CAT GAT ACG AAC AAT 555
Pro Gly Thr Phe Ser Leu Ile Val Glu Ala Trp His Asp Thr Asn Asn
125 130 135

AGC GGC AAT GCG CGA ACC AAC AAG CTC CTC ATC CAG CGA CTC TTG GTG 603
Ser Gly Asn Ala Arg Thr Asn Lys Leu Leu Ile Gln Arg Leu Leu Val
140 145 150

CAG CAG GTA CTG GAG GTG TCC TCC GAA TGG AAG ACG AAC AAG TCG GAA 651
Gln Gln Val Leu Glu Val Ser Ser Glu Trp Lys Thr Asn Lys Ser Glu
155 160 165 170

TCG CAG TAC ACG TCG CTG GAG TAC GAT TTC CGT GTC ACC TGC CAT CTC 699
Ser Gln Tyr Thr Ser Leu Glu Tyr Asp Phe Arg Val Thr Cys Asp Leu
175 180 185

AAC TAC TAC GGA TCC GCC TGT GCC AAG TTC TGC CGG CCC CGC GAC GAT 747
Asn Tyr Tyr Gly Ser Gly Cys Ala Lys Phe Cys Arg Pro Arg Asp Asp
190 195 200

TCA TTT GGA CAC TCG ACT TGC TCC GAC ACG GGC GAA ATT ATC TGT TTG 795
Ser Phe Gly His Ser Thr Cys Ser Glu Thr Gly Glu Ile Ile Cys Leu
205 210 215

ACC GGA TGG CAG GGC GAT TAC TGT CAC ATA CCC AAA TGC GCC AAA GGC 843
Thr Gly Trp Gln Gly Asp Tyr Cys His Ile Pro Lys Cys Ala Lys Gly
220 225 230

TGT GAA CAT GGA CAT TGC GAC AAA CCC AAT CAA TGC GTT TGC CAA CTG 891
Cys Glu His Gly His Cys Asp Lys Pro Asn Gln Cys Val Cys Gln Leu
235 240 245 250

GGC TGG AAG GGA GCC TTG TGC AAC GAG TGC GTT CTG GAA CCG AAC TGC 939
Gly Trp Lys Gly Ala Leu Cys Asn Glu Cys Val Leu Glu Pro Asn Cys
255 260 265

ATC CAT GGC ACC TGC AAC AAA CCC TGG ACT TCC ATC TGC AAC GAG GGT 987
Ile His Gly Thr Cys Asn Lys Pro Trp Thr Cys Ile Cys Asn Glu Gly
270 275 280

TGG GGA GGC TTG TAC TGC AAC CAG GAT CTG AAC TAC TGC ACC AAC CAC 1035
Trp Gly Gly Leu Tyr Cys Asn Gln Asp Leu Asn Tyr Cys Thr Asn His
285 290 295

AGA CCC TGC AAG AAT GGC GGA ACC TGC TTC AAC ACC GGC GAG GGA TTG 1083
Arg Pro Cys Lys Asn Gly Gly Thr Cys Phe Asn Thr Gly Glu Gly Leu
300 305 310

TAC ACA TGC AAA TGC GCT CCA GGA TAC AGT GGT GAT GAT TGC GAA AAT 1131
Tyr Thr Cys Lys Cys Ala Pro Gly Tyr Ser Gly Asp Asp Cys Glu Asn
315 320 325 330

GAG ATC TAC TCC TGC GAT GCC GAT GTC AAT CCC TGC CAG AAT GGT GGT 1179
Glu Ile Tyr Ser Cys Asp Ala Asp Val Asn Pro Cys Gln Asn Gly Gly
335 340 345

ACC TGC ATC GAT GAG CCG CAC ACA AAA ACC GGC TAC AAG TGT CAT TGC 1227
Thr Cys Ile Asp Glu Pro His Thr Lys Thr Gly Tyr Lys Cys His Cys
350 355 360

GCC AAC GGC TGG AGC GGA AAG ATG TGC GAG GAG AAA GTG CTC ACG TGT 1275
Ala Asn Gly Trp Ser Gly Lys Met Cys Glu Glu Lys Val Leu Thr Cys
365 370 375

TCG GAC AAA CCC TGT CAT CAG GGA ATC TGC CGC AAC GTT CGT CCT GGC 1323
Ser Asp Lys Pro Cys His Gln Gly Ile Cys Arg Asn Val Arg Pro Gly
380 385 390

TTG GGA AGC AAG GGT CAG GGC TAC CAG TGC GAA TGT CCC ATT GGC TAC 1371
Leu Gly Ser Lys Gly Gln Gly Tyr Gln Cys Glu Cys Pro Ile Gly Tyr
395 400 405 410

AGC GGA CCC AAC TGC GAT CTC CAG CTG GAC AAC TGC AGT CCG AAT CCA 1419
Ser Gly Pro Asn Cys Asp Leu Gln Leu Asp Asn Cys Ser Pro Asn Pro
415 420 425

TGC ATA AAC GGT GGA AGC TGT CAG CCG AGC GGA AAG TGT ATT TGC CCA 1467
Cys Ile Asn Gly Gly Ser Cys Gln Pro Ser Gly Lys Cys Ile Cys Pro
430 435 440

GCG GGA TTT TCG GGA ACG AGA TGC GAG ACC AAC ATT GAC GAT TGT CTT 1515
Ala Gly Phe Ser Gly Thr Arg Cys Glu Thr Asn Ile Asp Asp Cys Leu
445 450 455

GGC CAC CAG TGC GAG AAC GGA GGC ACC TGC ATA GAT ATG GTC AAC CAA 1563
Gly His Gln Cys Glu Asn Gly Gly Thr Cys Ile Asp Met Val Asn Gln
460 465 470

TAT CGC TGC CAA TGC GTT CCC GGT TTC CAT GGC ACC CAC TGT AGT AGC 1611
Tyr Arg Cys Gln Cys Val Pro Gly Phe His Gly Thr His Cys Ser Ser
475 480 485 490

AAA GTT GAC TTG TGC CTC ATC AGA CCG TGT GCC AAT GGA GGA ACC TGC 1659
Lys Val Asp Leu Cys Leu Ile Arg Pro Cys Ala Asn Gly Gly Thr Cys
495 500 505

TTG AAT CTC AAC AAC GAT TAC CAG TGC ACC TGT CGT GCG GGA TTT ACT 1707
Leu Asn Leu Asn Asn Asp Tyr Gln Cys Thr Cys Arg Ala Gly Phe Thr
510 515 520

GGC AAG GAT TGC TCT GTG GAC ATC GAT GAG TGC AGC AGT GGA CCC TGT 1755
Gly Lys Asp Cys Ser Val Asp Ile Asp Glu Cys Ser Ser Gly Pro Cys
525 530 535

CAT AAC GGC GGC ACT TGC ATG AAC CGC GTC AAT TCG TTC GAA TGC GTG 1803
His Asn Gly Gly Thr Cys Met Asn Arg Val Asn Ser Phe Glu Cys Val
540 545 550

TGT GCC AAT GGT TTC AGG GGC AAG CAG TGC GAT GAG GAG TCC TAC GAT 1851
Cys Ala Asn Gly Phe Arg Gly Lys Gln Cys Asp Glu Glu Ser Tyr Asp
555 560 565 570

TCG GTG ACC TTC GAT GCC CAC CAA TAT GGA GCG ACC ACA CAA GCG AGA 1899
Ser Val Thr Phe Asp Ala His Gln Tyr Gly Ala Thr Thr Gln Ala Arg
575 580 585

GCC GAT GGT TTG ACC AAT GCC CAG GTA GTC CTA ATT GCT GTT TTC TCC 1947
Ala Asp Gly Leu Thr Asn Ala Gln Val Val Leu Ile Ala Val Phe Ser
590 595 600

GTT GCG ATG CCT TTG GTG GCG GTT ATT GCG GCG TGC GTG GTC TTC TGC 1995
Val Ala Met Pro Leu Val Ala Val Ile Ala Ala Cys Val Val Phe Cys
605 610 615

ATG AAG CGC AAG CGT AAG CGT GCT CAG GAA AAG GAC GAC GCG GAG GCC 2043
Met Lys Arg Lys Arg Lys Arg Ala Gln Glu Lys Asp Asp Ala Glu Ala
620 625 630

AGG AAG CAG AAC GAA CAG AAT GCG GTG GCC ACA ATG CAT CAC AAT GGC 2091
Arg Lys Gln Asn Glu Gln Asn Ala Val Ala Thr Met His His Asn Gly
635 640 645 650

AGT GGG GTG GGT GTA GCT TTG GCT TCA GCC TCT CTG GGC GGC AAA ACT 2139
Ser Gly Val Gly Val Ala Leu Ala Ser Ala Ser Leu Gly Gly Lys Thr
655 660 665

GGC AGC AAC AGC GGT CTC ACC TTC GAT GGC GGC AAC CCG AAT ATC ATC 2187
Gly Ser Asn Ser Gly Leu Thr Phe Asp Gly Gly Asn Pro Asn Ile Ile
670 675 680

AAA AAC ACC TGG GAC AAG TCG GTC AAC AAC ATT TGT CCC TCA GCA GCA 2235
Lys Asn Thr Trp Asp Lys Ser Val Asn Asn Ile Cys Ala Ser Ala Ala
685 690 695

GCA GCG GCG GCG GCG GCA GCA GCG GCG GAC GAG TGT CTC ATG TAC GGC 2283
Ala Ala Ala Ala Ala Ala Ala Ala Ala Asp Glu Cys Leu Met Tyr Gly
700 705 710

GGA TAT GTG GCC TCG GTG GCG GAT AAC AAC AAT GCC AAC TCA GAC TTT 2331
Gly Tyr Val Ala Ser Val Ala Asp Asn Asn Asn Ala Asn Ser Asp Phe
715 720 725 730

TGT GTG GCT CCG CTA CAA AGA GCC AAG TCG CAA AAG CAA CTC AAC ACC 2379
Cys Val Ala Pro Leu Gln Arg Ala Lys Ser Gln Lys Gln Leu Asn Thr
735 740 745

GAT CCC ACG CTC ATG CAC CGC GGT TCG CCG GCA GGC AGC TCA GCC AAG 2427
Asp Pro Thr Leu Met His Arg Gly Ser Pro Ala Gly Ser Ser Ala Lys
750 755 760

GGA GCG TCT GGC GGA GGA CCG GGA GCG GCG GAG GGC AAG ACG ATC TCT 2475
Gly Ala Ser Gly Gly Gly Pro Gly Ala Ala Glu Gly Lys Arg Ile Ser
765 770 775

GTT TTA GGC GAG GGT TCC TAC TGT AGC CAG CGT TGG CCC TCG TTG GCG 2523
Val Leu Gly Glu Gly Ser Tyr Cys Ser Gln Arg Trp Pro Ser Leu Ala
780 785 790

GCG GCG GGA GTG GCC GGA GCC TGT TCA TCC CAG CTA ATG GCT GCA GCT 2571
Ala Ala Gly Val Ala Gly Ala Cys Ser Ser Gln Leu Met Ala Ala Ala
795 800 805 810

TCG GCA GCG GGC AGC GGA GCG GGG ACG GCG CAA CAG CAG CGA TCC GTG 2619
Ser Ala Ala Gly Ser Gly Ala Gly Thr Ala Gln Gln Gln Arg Ser Val
815 820 825

GTC TGC GGC ACT CCG CAT ATG TAACTCCAAA AATCCGGAAG GGCTCCTGGT 2670
Val Cys Gly Thr Pro His Met
830

AAATCCGGAG AAATCCGCAT GGAGGAGCTG ACAGCACATA CACAAAGAAA AGACTGGGTT 2730 

GGGTTCAAAA TGTGAGAGAG ACGCCAAAAT GTTGTTGTTG ATTGAAGCAG TTTAGTCGTC 2790 

ACGAAAAATG AAAAATCTGT AACAGGCATA ACTCGTAAAC TCCCTAAAAA ATTTGTATAG 2850 

TAATTAGCAA AGCTGTGACC CAGCCGTTTC GATCCCGAAT TC 2892 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 833 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met His Trp Ile Lys Cys Leu Leu Thr Ala Phe Ile Cys Phe Thr Val
1 5 10 15

Ile Val Gln Val His Ser Ser Gly Ser Phe Glu Leu Arg Leu Lys Tyr
20 25 30

Phe Ser Asn Asp His Gly Arg Asp Asn Glu Gly Arg Cys Cys Ser Gly
35 40 45

Glu Ser Asp Gly Ala Thr Gly Lys Cys Leu Gly Ser Cys Lys Thr Arg
50 55 60

Phe Arg Val Cys Leu Lys His Tyr Gln Ala Thr Ile Asp Thr Thr Ser
65 70 75 80

Gln Cys Thr Tyr Gly Asp Val Ile Thr Pro Ile Leu Gly Glu Asn Ser
85 90 95

Val Asn Leu Thr Asp Ala Gln Arg Phe Gln Asn Lys Gly Phe Thr Asn
100 105 110

Pro Ile Gln Phe Pro Phe Ser Phe Ser Trp Pro Gly Thr Phe Ser Leu
115 120 125

Ile Val Glu Ala Trp His Asp Thr Asn Asn Ser Gly Asn Ala Arg Thr
130 135 140

Asn Lys Leu Leu Ile Gln Arg Leu Leu Val Gln Gln Val Leu Glu Val
145 150 155 160

Ser Ser Glu Trp Lys Thr Asn Lys Ser Glu Ser Gln Tyr Thr Ser Leu
165 170 175

Glu Tyr Asp Phe Arg Val Thr Cys Asp Leu Asn Tyr Tyr Gly Ser Gly
180 185 190

Cys Ala Lys Phe Cys Arg Pro Arg Asp Asp Ser Phe Gly His Ser Thr
195 200 205

Cys Ser Glu Thr Gly Glu Ile Ile Cys Leu Thr Gly Trp Gln Gly Asp
210 215 220

Tyr Cys His Ile Pro Lys Cys Ala Lys Gly Cys Glu His Gly His Cys
225 230 235 240

Asp Lys Pro Asn Gln Cys Val Cys Gln Leu Gly Trp Lys Gly Ala Leu
245 250 255

Cys Asn Glu Cys Val Leu Glu Pro Asn Cys Ile His Gly Thr Cys Asn
260 265 270

Lys Pro Trp Thr Cys Ile Cys Asn Glu Gly Trp Gly Gly Leu Tyr Cys
275 280 285

Asn Gln Asp Leu Asn Tyr Cys Thr Asn His Arg Pro Cys Lys Asn Gly
290 295 300

Gly Thr Cys Phe Asn Thr Gly Glu Gly Leu Tyr Thr Cys Lys Cys Ala
305 310 315 320

Pro Gly Tyr Ser Gly Asp Asp Cys Glu Asn Glu Ile Tyr Ser Cys Asp
325 330 335

Ala Asp Val Asn Pro Cys Gln Asn Gly Gly Thr Cys Ile Asp Glu Pro
340 345 350

His Thr Lys Thr Gly Tyr Lys Cys His Cys Ala Asn Gly Trp Ser Gly
355 360 365

Lys Met Cys Glu Glu Lys Val Leu Thr Cys Ser Asp Lys Pro Cys His
370 375 380

Gln Gly Ile Cys Arg Asn Val Arg Pro Gly Leu Gly Ser Lys Gly Gln
385 390 395 400

Gly Tyr Gln Cys Glu Cys Pro Ile Gly Tyr Ser Gly Pro Asn Cys Asp
405 410 415

Leu Gln Leu Asp Asn Cys Ser Pro Asn Pro Cys Ile Asn Gly Gly Ser
420 425 430

Cys Gln Pro Ser Gly Lys Cys Ile Cys Pro Ala Gly Phe Ser Gly Thr
435 440 445

Arg Cys Glu Thr Asn Ile Asp Asp Cys Leu Gly His Gln Cys Glu Asn
450 455 460

Gly Gly Thr Cys Ile Asp Met Val Asn Gln Tyr Arg Cys Gln Cys Val
465 470 475 480

Pro Gly Phe His Gly Thr His Cys Ser Ser Lys Val Asp Leu Cys Leu
485 490 495

Ile Arg Pro Cys Ala Asn Gly Gly Thr Cys Leu Asn Leu Asn Asn Asp
500 505 510

Tyr Gln Cys Thr Cys Arg Ala Gly Phe Thr Gly Lys Asp Cys Ser Val
515 520 525

Asp Ile Asp Glu Cys Ser Ser Gly Pro Cys His Asn Gly Gly Thr Cys
530 535 540

Met Asn Arg Val Asn Ser Phe Glu Cys Val Cys Ala Asn Gly Phe Arg
545 550 555 560

Gly Lys Gln Cys Asp Glu Glu Ser Tyr Asp Ser Val Thr Phe Asp Ala
565 570 575

His Gln Tyr Gly Ala Thr Thr Gln Ala Arg Ala Asp Gly Leu Thr Asn
580 585 590

Ala Gln Val Val Leu Ile Ala Val Phe Ser Val Ala Met Pro Leu Val
595 600 605

Ala Val Ile Ala Ala Cys Val Val Phe Cys Met Lys Arg Lys Arg Lys
610 615 620

Arg Ala Gln Glu Lys Asp Asp Ala Glu Ala Arg Lys Gln Asn Glu Gln
625 630 635 640

Asn Ala Val Ala Thr Met His His Asn Gly Ser Gly Val Gly Val Ala
645 650 655

Leu Ala Ser Ala Ser Leu Gly Gly Lys Thr Gly Ser Asn Ser Gly Leu
660 665 670

Thr Phe Asp Gly Gly Asn Pro Asn Ile Ile Lys Asn Thr Trp Asp Lys
675 680 685

Ser Val Asn Asn Ile Cys Ala Ser Ala Ala Ala Ala Ala Ala Ala Ala
690 695 700

Ala Ala Ala Asp Glu Cys Leu Met Tyr Gly Gly Tyr Val Ala Ser Val
705 710 715 720

Ala Asp Asn Asn Asn Ala Asn Ser Asp Phe Cys Val Ala Pro Leu Gln
725 730 735

Arg Ala Lys Ser Gln Lys Gln Leu Asn Thr Asp Pro Thr Leu Met His
740 745 750

Arg Gly Ser Pro Ala Gly Ser Ser Ala Lys Gly Ala Ser Gly Gly Gly
755 760 765

Pro Gly Ala Ala Glu Gly Lys Arg Ile Ser Val Leu Gly Glu Gly Ser
770 775 780

Tyr Cys Ser Gln Arg Trp Pro Ser Leu Ala Ala Ala Gly Val Ala Gly
785 790 795 800

Ala Cys Ser Ser Gln Leu Met Ala Ala Ala Ser Ala Ala Gly Ser Gly
805 810 815

Ala Gly Thr Ala Gln Gln Gln Arg Ser Val Val Cys Gly Thr Pro His
820 825 830

Met

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1320 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 442..1320

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CCGAGTCGAG CGCCGTGCTT CGAGCGGTGA TGAGCCCCTT TTCTGTCAAC GCTAAAGATC 60 

TACAAAACAT CAGCGCCTAT CAAGTGGAAG TGTCAAGTGT GAACAAAACA AAAACGAGAG 120 

AAGCACATAC TAAGGTCCAT ATAAATAATA AATAATAATT GTGTGTGATA ACAACATTAT 180 

CCAAACAAAA CCAAACAAAA CGAAGGCAAA GTGGAGAAAA TGATACAGCA TCCAGAGTAC 240 

GGCCGTTATT CAGCTATCCA GAGCAAGTGT AGTGTGGCAA AATAGAAACA AACAAAGGCA 300 

CCAAAATCTG CATACATGGG CTAATTAAGG CTGCCCAGCG AATTTACATT TGTGTGGTGC 360 

CAATCCAGAG TGAATCCGAA ACAAACTCCA TCTAGATCGC CAACCAGCAT CACGCTCGCA 420 

AACGCCCCCA GAATGTACAA A ATG TTT AGG AAA CAT TTT CGG CGA AAA CCA 471
Met Phe Arg Lys His Phe Arg Arg Lys Pro
1 5 10

GCT ACG TCG TCG TCG TTG GAG TCA ACA ATA * TCA GCA GAC AGC CTG 519
Ala Thr Ser Ser Ser Leu Glu Ser Thr Ile Glu Ser Ala Asp Ser Leu
15 20 25

GGA ATG TCC AAG AAG ACG GCG ACA AAA AGG CAG CGT CCG ACG CAT CGG 567
Gly Met Ser Lys Lys Thr Ala Thr Lys Arg Gln Arg Pro Arg His Arg
30 35 40

GTA CCC AAA ATC GCG ACC CTG CCA TCG ACG ATC CGC GAT TGT CCA TCA 615
Val Pro Lys Ile Ala Thr Leu Pro Ser Thr Ile Arg Asp Cys Arg Ser
45 50 55

TTA AAG TCT GCC TGC AAC TTA ATT GCT TTA ATT TTA ATA CTG TTA GTC 663
Leu Lys Ser Ala Cys Asn Leu Ile Ala Leu Ile Leu Ile Leu Leu Val
60 65 70

CAT AAG ATA TCC GCA GCT GGT AAC TTC GAG CTG GAA ATA TTA GAA ATC 711
His Lys Ile Ser Ala Ala Gly Asn Phe Glu Leu Glu Ile Leu Glu Ile
75 80 85 90

TCA AAT ACC AAC AGC CAT CTA CTC AAC GGC TAT TGC TGC GGC ATG CCA 759
Ser Asn Thr Asn Ser His Leu Leu Asn Gly Tyr Cys Cys Gly Met Pro
95 100 105

GC5 GAA CTT AGG GCC ACC AAG ACG ATA GGC TGC TCG C*\ TGC ACG ACG 807
Ala Glu Leu Arg Ala Thr Lys Thr Ile Gly Cys Ser Pro Cys Thr Thr
110 115 120

GCA TTC CGG CTG TGC CTG AAG GAG TAC CAG ACC ACG GAG CAG GGT GCC 855
Ala Phe Arg Leu Cys Leu Lys Glu Tyr Gln Thr Thr Glu Gln Gly Ala
125 130 135

AGC ATA TCC ACG CGC TGT TCG TTT GGC AAC GCC ACC ACC AAG ATA CTG 903
Ser Ile Ser Thr Gly Cys Ser Phe Gly Asn Ala Thr Thr Lys Ile Leu
140 145 150

GGT GGC TCC AGC TTT GTG CTC AGC GAT CCG GGT GTG GGA GCC ATT GTG 951
Gly Gly Ser Ser Phe Val Leu Ser Asp Pro Gly Val Gly Ala Ile Val
155 160 165 170

CTG CCC TTT ACG TTT CGT TGG ACG AAG TCG TTT ACG CTG ATA CTG CAG 999
Leu Pro Phe Thr Phe Arg Trp Thr Lys Ser Phe Thr Leu Ile Leu Gln
175 180 185



GCG TTG GAT ATG TAC AAC ACA TCC TAT CCA GAT GCG GAG AGG TTA ATT 1047
Ala Leu Asp Met Tyr Asn Thr Ser Tyr Pro Asp Ala Glu Arg Leu Ile
190 195 200

GAG GAA ACA TCA TAC TCG GGC GTG ATA CTG CCG TCG CCG GAG TGG AAG 1095
Glu Glu Thr Ser Tyr Ser Gly Val Ile Leu Pro Ser Pro Glu Trp Lys
205 210 215

ACG CTG GAC CAC ATC GGG CGG AAC GCG CGG ATC ACC TAC CGT GTC CGG 1143
Thr Leu Asp His Ile Gly Arg Asn Ala Arg Ile Thr Tyr Arg Val Arg
220 225 230

GTG CAA TGC GCC GTT ACC TAC TAC AAC ACG ACC TGC ACG ACC TTC TGC 1191
Val Gln Cys Ala Val Thr Tyr Tyr Asn Thr Thr Cys Thr Thr Phe Cys
235 240 245 250

CGT CCG CGG GAC GAT CAG TTC GGT CAC TAC GCC TGC GGC TCC GAG GGT 1239
Arg Pro Arg Asp Asp Gln Phe Gly His Tyr Ala Cys Gly Ser Glu Gly
255 260 265

CAG AAG CTC TGC CTG AAT GGC TGG CAG GGC GTC AAC TGC GAG GAG GCC 1287
Gln Lys Leu Cys Leu Asn Gly Trp Gln Gly Val Asn Cys Glu Glu Ala
270 275 280

ATA TGC AAG GCG GGC TGC GAC CCC GTC CAC GGC 1320
Ile Cys Lys Ala Gly Cys Asp Pro Val His Gly
285 290

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 293 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Phe Arg Lys His Phe Arg Arg Lys Pro Ala Thr Ser Ser Ser Leu
1 5 10 15

Glu Ser Thr Ile Glu Ser Ala Asp Ser Leu Gly Met Ser Lys Lys Thr
20 25 30

Ala Thr Lys Arg Gln Arg Pro Arg His Arg Val Pro Lys Ile Ala Thr
35 40 45

Leu Pro Ser Thr Ile Arg Asp Cys Arg Ser Leu Lys Ser Ala Cys Asn
50 55 60

Leu Ile Ala Leu Ile Leu Ile Leu Leu Val His Lys Ile Ser Ala Ala
65 70 75 80

Gly Asn Phe Glu Leu Glu Ile Leu Glu Ile Ser Asn Thr Asn Ser His
85 90 95

Leu Leu Asn Gly Tyr Cys Cys Gly Met Pro Ala Glu Leu Arg Ala Thr
100 105 110

Lys Thr Ile Gly Cys Ser Pro Cys Thr Thr Ala Phe Arg Leu Cys Leu
115 120 125

Lys Glu Tyr Gln Thr Thr Glu Gln Gly Ala Ser Ile Ser Thr Gly Cys
130 135 140

Ser Phe Gly Asn Ala Thr Thr Lys Ile Leu Gly Gly Ser Ser Phe Val
145 150 155 160

Leu Ser Asp Pro Gly Val Gly Ala Ile Val Leu Pro Phe Thr Phe Arg
165 170 175

Trp Thr Lys Ser Phe Thr Leu Ile Leu Gln Ala Leu Asp Met Tyr Asn
180 185 190

Thr Ser Tyr Pro Asp Ala Glu Arg Leu Ile Glu Glu Thr Ser Tyr Ser
195 200 205

Gly Val Ile Leu Pro Ser Pro Glu Trp Lys Thr Leu Asp His Ile Gly
210 215 220

Arg Asn Ala Arg Ile Thr Tyr Arg Val Arg Val Gln Cys Ala Val Thr
225 230 235 240

Tyr Tyr Asn Thr Thr Cys Thr Thr Phe Cys Arg Pro Arg Asp Asp Gln
245 250 255

Phe Gly His Tyr Ala Cys Gly Ser Glu Gly Gln Lys Leu Cys Leu Asn
260 265 270

Gly Trp Gln Gly Val Asn Cys Glu Glu Ala Ile Cys Lys Ala Gly Cys
275 280 285

Asp Pro Val His Gly
290

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CGGTGGACTT CCTTCGTGTA TTGGTGGGAG CCCTCGGGAA CGGGGGGTAA CACTGAAAGG 60 

TCGAGTACCC ATTTCCGTCA TAACGGGTTG GTCGCCCCCT AGGGGTCGGA GTCAGGTGGA 120 

CGGGAGGTCG ACAACGCCCG GGGGACGGGT GGTACATGGT GTAAGGTCTT TACCCGACCG 180 

GGCAAACGGG TCACACCGAA AGGGGTGAAC GGTAACTACG GGGTCGTCCT GCCCGTCCAT 240 

CGAGTCTGGT AAGAGGGTCG CCTTAAG 267 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 574 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GAATTCCTTC CATTATACGT GACTTTTCTG AAACTGTAGC CACCCTAGTG TCTCTAACTC 60 

CCTCTGGAGT TTGTCAGCTT TGGTCTTTTC AAAGAGCAGG CTCTCTTCAA GCTCCTTAAT 120 

GCGGGCATGC TCCACTTTGG TCTGCGTCTC AAGATCACCT TTGGTAATTG ATTCTTCTTC 180 

AACCCGGAAC TGAAGGCTGG CTCTCACCCT CTAGGCAGAG CAGGAATTCC GAGGTGGATG 240 

TGTTAGATGT GAATGTCCGT GGCCCAGATG GCTGCACCCC ATTGATGTTG GCTTCTCTCC 300 

GAGGAGGCAG CTCAGATTTG AGTGATGAAG ATGAAGATGC AGAGGACTGT TCTGCTAACA 360 

TCATCACAGA CTTGGTCTAC CAGGGTGCCA GCCTCCAGNC CAGACAGACC GGACTGGTGA 420 

GATGGCCCTG CACCTTGCAG CCCGCTACTC ACGGGCTGAT GCTGCCAAGC GTCTCCTGGA 480 

TGCAGGTGCA GATGCCAATG CCCAGGACAA CATGGGCCGC TGTCCACTCC ATGCTGCAGT 540 

GGCACGTGAT GCCAAGGTGT ATTCAGATCT GTTA 574 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 295 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TCCAGATTCT GATTCGCAAC CGAGTAACTG ATCTAGATGC CAGGATGAAT GATGGTACTA 60 

CACCCCTGAT CCTGGCTGCC CGCCTGGCTG TGGAGGGAAT GGTGGCAGAA CTGATCAACT 120 

GCCAAGCGGA TGTGAATGCA GTGGATGACC ATGGAAAATC TGCTCTTCAC TGGGCAGCTG 180 

CTGTCAATAA TGTGGAGGCA ACTCTTTTGT TGTTGAAAAA TGGGGCCAAC CGAGACATGC 240 

AGGACAACAA GGAAGAGACA CCTCTGTTTC TTGCTGCCCG GGAGGAGCTA TAAGC 295 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 248 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAATTCCATT CAGGAGGAAA GGGTGGGGAG AGAAGCAGGC ACCCACTTTC CCGTGGCTGG 60 

ACTCGTTCCC AGGTGGCTCC ACCGGCAGCT GTGACCGCCG CAGGTGGGGG CGGAGTGCCA 120 

TTCAGAAAAT TCCAGAAAAG CCCTACCCCA ACTCGGACGG CAACGTCACA CCCGTGGGTA 180 
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GCAACTCGCA CACAAACAGC CAGCGTGTCT GGGGCACGGG GGGATGGCAC CCCCTGCAGG 240 
CAGAGCTC 248 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TACGTATCTC GAGCACAGAC AGCTGACGTA CACTTTTNNA GTGCGAGGGA CATTCGTCCG 60

ACCAGTACGA ACATTTAGGC TCAGTACGGT AGGTCCATGG CCAAGACTAG GAGACGTACG 120

GAGCTACAGG TCCCGCTCGC TAAACTCGGA CCACTGAAAC CTCCGGTCGA CAGTCGGTAA 180

GCGAACAAGA GGGCCAGATC TTAGAGAAGG TGTCGCGGCG AGACTCGGGC TCGGGTCAGG 240

CGGCCTTAAG GACGTCGGGC CCNNNAGGTG ATCAAGATCT CGNCNCGGCG GGCGCCACCT 300

CGAGGNCGAA AACAAGGGAA ATC 323

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3234 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..3234

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TGC CAG GAG GAC GCG GGC AAC AAG GTC TGC AGC CTG CAG TGC AAC AAC 48
Cys Gln Glu Asp Ala Gly Asn Lys Val Cys Ser Leu Gln Cys Asn Asn
1 5 10 15

CAC GCG TGC GGC TGG GAC GGC GGT GAC TGC TCC CTC AAC TTC AAT GAC 96
His Ala Cys Gly Trp Asp Gly Gly Asp Cys Ser Leu Asn Phe Asn Asp
20 25 30

CCC TGG AAG AAC TGC ACG CAG TCT CTG CAG TGC TGG AAG TAG TTC AGT 144
Pro Trp Lys Asn Cys Thr Gln Ser Leu Gln Cys Trp Lys Tyr Phe Ser
35 40 45

GAC GGC CAC TGT GAC AGC CAG TGC AAC TCA GCC GGC TGC CTC TTC GAC 192
Asp Gly His Cys Asp Ser Gln Cys Asn Ser Ala Gly Cys Leu Phe Asp
50 55 60

GGC TTT GAC TGC CAG CGT GCG GAA GGC CAG TGC AAC CCC CTG TAC GAC 240
Gly Phe Asp Cys Gln Arg Ala Glu Gly Gln Cys Asn Pro Leu Tyr Asp
65 70 75 80

CAG TAC TGC AAG GAC CAC TTC AGC GAC GGG CAC TGC GAC CAG GGC TGC 288
Gln Tyr Cys Lys Asp His Phe Ser Asp Gly His Cys Asp Gln Gly Cys
85 90 95

AAC AGC GCG GAG TGC GAG TGG GAC GGG CTG GAC TGT GCG GAG CAT GTA 336
Asn Ser Ala Glu Cys Glu Trp Asp Gly Leu Asp Cys Ala Glu His Val
100 105 110

CCC GAG AGG CTG GCG GCC GGC ACG CTG GTG GTG GTG GTG CTG ATG CCG 384
Pro Glu Arg Leu Ala Ala Gly Thr Leu Val Val Val Val Leu Met Pro
115 120 125

CCG GAG CAG CTG CGC AAC AGC TCC TTC CAC TTC CTG CGG GAG CTC AGC 432
Pro Glu Gln Leu Arg Asn Ser Ser Phe His Phe Leu Arg Glu Leu Ser
130 135 140

CGC GTG CTG CAC ACC AAC GTG GTC TTC AAG CGT GAC GCA CAC GGC CAG 480
Arg Val Leu His Thr Asn Val Val Phe Lys Arg Asp Ala His Gly Gln
145 150 155 160

CAG ATG ATC TTC CCC TAC TAC GGC CGC GAG GAG GAG CTG CGC AAG CAC 528
Gln Met Ile Phe Pro Tyr Tyr Gly Arg Glu Glu Glu Leu Arg Lys His
165 170 175

CCC ATC AAG CGT GCC GCC GAG GGC TGG GCC GCA CCT GAC GCC CTG CTG 576
Pro Ile Lys Arg Ala Ala Glu Gly Trp Ala Ala Pro Asp Ala Leu Leu
180 185 190

GGC CAG GTG AAG GCC TCG CTG CTC CCT GGT GGC AGC GAG GGT GGG CGG 624
Gly Gln Val Lys Ala Ser Leu Leu Pro Gly Gly Ser Glu Gly Gly Arg
195 200 205

CGG CGG AGG GAG CTG GAC CCC ATG GAC GTC CGC GGC TCC ATC GTC TAC 672
Arg Arg Arg Glu Leu Asp Pro Met Asp Val Arg Gly Ser Ile Val Tyr
210 215 220

CTG GAG ATT GAC AAC CGG CAG TGT GTG CAG GCC TCC TCG CAG TGC TTC 720
Leu Glu Ile Asp Asn Arg Gln Cys Val Gln Ala Ser Ser Gln Cys Phe
225 230 235 240

CAG AGT GCC ACC GAC GTG GCC GCA TTC CTG GGA GCG CTC GCC TCG CTG 768
Gln Ser Ala Thr Asp Val Ala Ala Phe Leu Gly Ala Leu Ala Ser Leu
245 250 255

GGC AGC CTC AAC ATC CCC TAC AAG ATC GAG GCC GTG CAG AGT GAG ACC 816
Gly Ser Leu Asn Ile Pro Tyr Lys Ile Glu Ala Val Gln Ser Glu Thr
260 265 270

GTG GAG CCG CCC CCG CCG GCG CAG CTG CAC TTC ATG TAC GTG GCG GCG 864
Val Glu Pro Pro Pro Pro Ala Gln Leu His Phe Met Tyr Val Ala Ala
275 280 285

GCC GCC TTT GTG CTT CTG TTC TTC GTG GGC TGC GGG GTG CTG CTG TCC 912
Ala Ala Phe Val Leu Leu Phe Phe Val Gly Cys Gly Val Leu Leu Ser
290 295 300

CGC AAG CGC CGG CGG CAG CAT GGC CAG CTC TGG TTC CCT GAG GGC TTC 960
Arg Lys Arg Arg Arg Gln His Gly Gln Leu Trp Phe Pro Glu Gly Phe
305 310 315 320

AAA GTG TCT GAG GCC AGC AAG AAG AAG CGG CGG GAG CCC CTC GGC GAG 1008
Lys Val Ser Glu Ala Ser Lys Lys Lys Arg Arg Glu Pro Leu Gly Glu
325 330 335

GAC TCC GTG GGC CTC AAG CCC CTG AAG AAC GCT TCA GAG GGT GCC CTC 1056
Asp Ser Val Gly Leu Lys Pro Leu Lys Asn Ala Ser Asp Gly Ala Leu
340 345 350

ATG GAC GAC AAC CAG AAT GAG TGG GGG GAC GAG GAC CTG GAG ACC AAG 1104
Met Asp Asp Asn Gln Asn Glu Trp Gly Asp Glu Asp Leu Glu Thr Lys
355 360 365

AAG TTC CGG TTC GAG GAG CCC GTG GTT CTG CCT GAC CTG GAC GAC CAG 1152
Lys Phe Arg Phe Glu Glu Pro Val Val Leu Pro Asp Leu Asp Asp Gln
370 375 380

ACA GAC CAC CGG CAG TGG ACT CAG CAG CAC CTG GAT GCC GCT GAC CTG 1200
Thr Asp His Arg Gln Trp Thr Gln Gln His Leu Asp Ala Ala Asp Leu
385 390 395 400

CGC ATG TCT GCC ATG GCC CCC ACA CCG CCC CAG GGT GAG GTT GAC GCC 1248
Arg Met Ser Ala Met Ala Pro Thr Pro Pro Gln Gly Glu Val Asp Ala
405 410 415

GAC TGC ATG GAC GTC AAT GTC CGC CGG CCT GAT GGC TTC ACC CCG CTC 1296
Asp Cys Met Asp Val Asn Val Arg Gly Pro Asp Gly Phe Thr Pro Leu
420 425 430

ATG ATC GCC TCC TGC AGC GGG GGC GGC CTG GAG ACG GGC AAC AGC GAG 1344
Met Ile Ala Ser Cys Ser Gly Gly Gly Leu Glu Thr Gly Asn Ser Glu
435 440 445

GAA GAG GAG GAC GCG CCG GCC GTC ATC TCC GAC TTC ATC TAC CAG GGC 1392
Glu Glu Glu Asp Ala Pro Ala Val Ile Ser Asp Phe Ile Tyr Gln Gly
450 455 460

GCC AGC CTG CAC AAC CAG ACA GAC CGC ACG GGC GAG ACC GCC TTG CAC 1440
Ala Ser Leu His Asn Gln Thr Asp Arg Thr Gly Glu Thr Ala Leu His
465 470 475 480

CTG GCC GCC CGC TAC TCA CGC TCT GAT GCC CCC AAG CGC CTG CTG GAC 1488
Leu Ala Ala Arg Tyr Ser Arg Ser Asp Ala Ala Lys Arg Leu Leu Glu
485 490 495

GCC AGC GCA GAT GCC AAC ATC CAG GAC AAC ATG GGC CGC ACC CCG CTG 1536
Ala Ser Ala Asp Ala Asn Ile Gln Asp Asn Met Gly Arg Thr Pro Leu
500 505 510

CAT GCG GCT GTG TCT GCC GAC GCA CAA GGT GTC TTC CAG ATC CTG ATC 1584
His Ala Ala Val Ser Ala Asp Ala Gln Gly Val Phe Gln Ile Leu Ile
515 520 525

CGG AAC CGA GCC ACA GAC CTG GAT GCC CGC ATG CAT GAT GGC ACG ACG 1632
Arg Asn Arg Ala Thr Asp Leu Asp Ala Arg Met His Asp Gly Thr Thr
530 535 540

CCA CTG ATC CTG GCT GCC CGC CTG GCC GTG GAG GGC ATG CTG GAG GAC 1680
Pro Leu Ile Leu Ala Ala Arg Leu Ala Val Glu Gly Met Leu Glu Asp
545 550 555 560

CTC ATC AAC TCA CAC GCC GAC GTC AAC GCC GTA GAT GAC CTG GGC AAG 1728
Leu Ile Asn Ser His Ala Asp Val Asn Ala Val Asp Asp Leu Gly Lys
565 570 575

TCC GCC CTG CAC TGG GCC GCC GCC GTG AAC AAT GTG GAT GCC GCA GTT 1776
Ser Ala Leu His Trp Ala Ala Ala Val Asn Asn Val Asp Ala Ala Val
580 585 590

GTG CTC CTG AAG AAC GGG GCT AAC AAA GAT ATG CAG AAC AAC AGG GAG 1824
Val Leu Leu Lys Asn Gly Ala Asn Lys Asp Met Gln Asn Asn Arg Glu
595 600 605

GAG ACA CCC CTG TTT CTG GCC GCC CGG GAG GGC AGC TAC GAG ACC GCC 1872
Glu Thr Pro Leu Phe Leu Ala Ala Arg Glu Gly Ser Tyr Glu Thr Ala
610 615 620

AAG GTG CTG CTG GAC CAC TTT GCC AAC CGG GAC ATC ACG GAT CAT ATG 1920
Lys Val Leu Leu Asp His Phe Ala Asn Arg Asp Ile Thr Asp His Met
625 630 635 640

GAC CGC CTG CCG CGC GAC ATC GCA CAG GAG CGC ATG CAT CAC GAC ATC 1968
Asp Arg Leu Pro Arg Asp Ile Ala Gln Glu Arg Met His His Asp Ile
645 650 655

GTG AGG CTG CTG GAC GAG TAC AAC CTG GTG CGC AGC CCG CAG CTG CAC 2016
Val Arg Leu Leu Asp Glu Tyr Asn Leu Val Arg Ser Pro Gln Leu His
660 665 670

GGA GCC CCG CTG GGG GGC ACG CCC ACC CTG TCG CCC CCG CTC TGC TCG 2064
Gly Ala Pro Leu Gly Gly Thr Pro Thr Leu Ser Pro Pro Leu Cys Ser
675 680 685

CCC AAC GGC TAC CTG GGC AGC CTC AAG CCC GGC GTG CAG GGC AAG AAG 2112
Pro Asn Gly Tyr Leu Gly Ser Leu Lys Pro Gly Val Gln Gly Lys Lys
690 695 700

GTC CGC AAG CCC AGC AGC AAA GGC CTG GCC TGT GGA AGC AAG GAG GCC 2160
Val Arg Lys Pro Ser Ser Lys Gly Leu Ala Cys Gly Ser Lys Glu Ala
705 710 715 720

AAG GAC CTC AAG GCA CGG AGG AAG AAG TCC CAG GAT GGC AAG GGC TGC 2208
Lys Asp Leu Lys Ala Arg Arg Lys Lys Ser Gln Asp Gly Lys Gly Cys
725 730 735

CTG CTG GAC AGC TCC GGC ATG CTC TCG CCC GTG GAC TCC CTG GAG TCA 2256
Leu Leu Asp Ser Ser Gly Met Leu Ser Pro Val Asp Ser Leu Glu Ser
740 745 750

CCC CAT CGC TAC CTG TCA GAC GTG GCC TCG CCG CCA CTG CTG CCC TCC 2304
Pro His Gly Tyr Leu Ser Asp Val Ala Ser Pro Pro Leu Leu Pro Ser
755 760 765

CCG TTC CAG CAG TCT CCG TCC GTG CCC CTC AAC CAC CTG CCT GGG ATG 2352
Pro Phe Gln Gln Ser Pro Ser Val Pro Leu Asn His Leu Pro Gly Met
770 775 780

CCC GAC ACC CAC CTG GGC ATC GGG CAC CTG AAC GTG GCG GCC AAG CCC 2400
Pro Asp Thr His Leu Gly Ile Gly His Leu Asn Val Ala Ala Lys Pro
785 790 795 800

GAG ATG GCG GCG CTG GGT GGG GGC GGC CGG CTG GCC TTT GAG ACT GGC 2448
Glu Met Ala Ala Leu Gly Gly Gly Gly Arg Leu Ala Phe Glu Thr Gly
805 810 815

CCA CCT CGT CTC TCC CAC CTG CCT GTG GCC TCT GGC ACC AGC ACC GTC 2496
Pro Pro Arg Leu Ser His Leu Pro Val Ala Ser Gly Thr Ser Thr Val
820 825 830

CTG GGC TCC AGC AGC GGA GGG GCC CTG AAT TTC ACT GTG GGC GGG TCC 2544
Leu Gly Ser Ser Ser Gly Gly Ala Leu Asn Phe Thr Val Gly Gly Ser
835 840 845

ACC AGT TTG AAT GGT CAA TCC GAC TGG CTG TCC CGG CTG CAG AGC GGC 2592
Thr Ser Leu Asn Gly Gln Cys Glu Trp Leu Ser Arg Leu Gln Ser Gly
850 855 860

ATG GTG CCG AAC CAA TAC AAC CCT CTG CGG GGG AGT GTG GCA CCA GGC 2640
Met Val Pro Asn Gln Tyr Asn Pro Leu Arg Gly Ser Val Ala Pro Gly
865 870 875 880

CCC CTG AGC ACA CAG GCC CCC TCC CTG CAG CAT GGC ATG GTA GGC CCG 2688
Pro Leu Ser Thr Gln Ala Pro Ser Leu Gln His Gly Met Val Gly Pro
885 890 895

CTG CAC AGT AGC CTT GCT GCC AGC GCC CTG TCC CAG ATG ATG AGC TAC 2736
Leu His Ser Ser Leu Ala Ala Ser Ala Leu Ser Gln Met Met Ser Tyr
900 905 910

CAG GGC CTG CCC AGC ACC CGG CTG GCC ACC CAG CCT CAC CTG GTG CAG 2784
Gln Gly Leu Pro Ser Thr Arg Leu Ala Thr Gln Pro His Leu Val Gln
915 920 925

ACC CAG CAG GTG CAG CCA CAA AAC TTA CAG ATG CAG CAG CAG AAC CTG 2832
Thr Gln Gln Val Gln Pro Gln Asn Leu Gln Met Gln Gln Gln Asn Leu
930 935 940

CAG CCA GCA AAC ATC CAG CAG CAG CAA AGC CTG CAG CCG CCA CCA CCA 2880
Gln Pro Ala Asn Ile Gln Gln Gln Gln Ser Leu Gln Pro Pro Pro Pro
945 950 955 960

CCA CCA CAG CCG CAC CTT GGC GTG AGC TCA GCA GCC AGC GGC CAC CTG 2928
Pro Pro Gln Pro His Leu Gly Val Ser Ser Ala Ala Ser Gly His Leu
965 970 975

GGC CGG AGC TTC CTG AGT GGA GAG CCG AGC CAG GCA GAC GTG CAG CCA 2976
Gly Arg Ser Phe Leu Ser Gly Glu Pro Ser Gln Ala Asp Val Gln Pro
980 985 990

CTG GGC CCC AGC AGC CTG GCG GTG CAC ACT ATT CTG CCC CAG GAG AGC 3024
Leu Gly Pro Ser Ser Leu Ala Val His Thr Ile Leu Pro Gln Glu Ser
995 1000 1005

CCC GCC CTG CCC ACG TCG CTG CCA TCC TCG CTG GTC CCA CCC GTG ACC 3072
Pro Ala Leu Pro Thr Ser Leu Pro Ser Ser Leu Val Pro Pro Val Thr
1010 1015 1020

GCA GCC CAG TTC CTG ACG CCC CCC TCG CAG CAC AGC TAC TCC TCG CCT 3120
Ala Ala Gln Phe Leu Thr Pro Pro Ser Gln His Ser Tyr Ser Ser Pro
1025 1030 1035 1040

GTG GAC AAC ACC CCC AGC CAC CAG CTA CAG GTG CCT GTT CCT GTA ATG 3168
Val Asp Asn Thr Pro Ser His Gln Leu Gln Val Pro Val Pro Val Met
1045 1050 1055

GTA ATG ATC CGA TCT TCG GAT CCT TCT AAA GGC TCA TCA ATT TTG ATC 3216
Val Met Ile Arg Ser Ser Asp Pro Ser Lys Gly Ser Ser Ile Leu Ile
1060 1065 1070

GAA GCT CCC GAC TCA TGG 3234
Glu Ala Pro Asp Ser Trp
1075

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1078 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Cys Gln Glu Asp Ala Gly Asn Lys Val Cys Ser Leu Gln Cys Asn Asn
1 5 10 15



His Ala Cys Gly Trp Asp Gly Gly Asp Cys Ser Leu Asn Phe Asn Asp
20 25 30

Pro Trp Lys Asn Cys Thr Gln Ser Leu Gln Cys Trp Lys Tyr Phe Ser
35 40 45

Asp Gly His Cys Asp Ser Gln Cys Asn Ser Ala Gly Cys Leu Phe Asp
50 55 60

Gly Phe Asp Cys Gln Arg Ala Glu Gly Gln Cys Asn Pro Leu Tyr Asp
65 70 75 80

Gln Tyr Cys Lys Asp His Phe Ser Asp Gly His Cys Asp Gln Gly Cys
85 90 95

Asn Ser Ala Glu Cys Glu Trp Asp Gly Leu Asp Cys Ala Glu His Val
100 105 110

Pro Glu Arg Leu Ala Ala Gly Thr Leu Val Val Val Val Leu Met Pro
115 120 125

Pro Glu Gln Leu Arg Asn Ser Ser Phe His Phe Leu Arg Glu Leu Ser
130 135 140

Arg Val Leu His Thr Asn Val Val Phe Lys Arg Asp Ala His Gly Gln
145 150 155 160

Gln Met Ile Phe Pro Tyr Tyr Gly Arg Glu Glu Glu Leu Arg Lys His
165 170 175

Pro Ile Lys Arg Ala Ala Glu Gly Trp Ala Ala Pro Asp Ala Leu Leu
180 185 190

Gly Gln Val Lys Ala Ser Leu Leu Pro Gly Gly Ser Glu Gly Gly Arg
195 200 205

Arg Arg Arg Glu Leu Asp Pro Met Asp Val Arg Gly Ser Ile Val Tyr
210 215 220

Leu Glu Ile Asp Asn Arg Gln Cys Val Gln Ala Ser Ser Gln Cys Phe
225 230 235 240

Gln Ser Ala Thr Asp Val Ala Ala Phe Leu Gly Ala Leu Ala Ser Leu
245 250 255

Gly Ser Leu Asn Ile Pro Tyr Lys Ile Glu Ala Val Gln Ser Glu Thr
260 265 270

Val Glu Pro Pro Pro Pro Ala Gln Leu His Phe Met Tyr Val Ala Ala
275 280 285

Ala Ala Phe Val Leu Leu Phe Phe Val Gly Cys Gly Val Leu Leu Ser
290 295 300

Arg Lys Arg Arg Arg Gln His Gly Gln Leu Trp Phe Pro Glu Gly Phe
305 310 315 320

Lys Val Ser Glu Ala Ser Lys Lys Lys Arg Arg Glu Pro Leu Gly Glu
325 330 335

Asp Ser Val Gly Leu Lys Pro Leu Lys Asn Ala Ser Asp Gly Ala Leu
340 345 350

Met Asp Asp Asn Gln Asn Glu Trp Gly Asp Glu Asp Leu Glu Thr Lys
355 360 365

Lys Phe Arg Phe Glu Glu Pro Val Val Leu Pro Asp Leu Asp Asp Gln
370 375 380

Thr Asp His Arg Gln Trp Thr Gln Gln His Leu Asp Ala Ala Asp Leu
385 390 395 400

Arg Met Ser Ala Met Ala Pro Thr Pro Pro Gln Gly Glu Val Asp Ala
405 410 415

Asp Cys Met Asp Val Asn Val Arg Gly Pro Asp Gly Phe Thr Pro Leu
420 425 430

Met Ile Ala Ser Cys Ser Gly Gly Gly Leu Glu Thr Gly Asn Ser Glu
435 440 445

Glu Glu Glu Asp Ala Pro Ala Val Ile Ser Asp Phe Ile Tyr Gln Gly
450 455 460

Ala Ser Leu His Asn Gln Thr Asp Arg Thr Gly Glu Thr Ala Leu His
465 470 475 480

Leu Ala Ala Arg Tyr Ser Arg Ser Asp Ala Ala Lys Arg Leu Leu Glu
485 490 495

Ala Ser Ala Asp Ala Asn Ile Gln Asp Asn Met Gly Arg Thr Pro Leu
500 505 510

His Ala Ala Val Ser Ala Asp Ala Gln Gly Val Phe Gln Ile Leu Ile
515 520 525

Arg Asn Arg Ala Thr Asp Leu Asp Ala Arg Met His Asp Gly Thr Thr
530 535 540

Pro Leu Ile Leu Ala Ala Arg Leu Ala Val Glu Gly Met Leu Glu Asp
545 550 555 560

Leu Ile Asn Ser His Ala Asp Val Asn Ala Val Asp Asp Leu Gly Lys
565 570 575

Ser Ala Leu His Trp Ala Ala Ala Val Asn Asn Val Asp Ala Ala Val
580 585 590

Val Leu Leu Lys Asn Gly Ala Asn Lys Asp Met Gln Asn Asn Arg Glu
595 600 605

Glu Thr Pro Leu Phe Leu Ala Ala Arg Glu Gly Ser Tyr Glu Thr Ala
610 615 620

Lys Val Leu Leu Asp His Phe Ala Asn Arg Asp Ile Thr Asp His Met
625 630 635 640

Asp Arg Leu Pro Arg Asp Ile Ala Gln Glu Arg Met His His Asp Ile
645 650 655

Val Arg Leu Leu Asp Glu Tyr Asn Leu Val Arg Ser Pro Gln Leu His
660 665 670

Gly Ala Pro Leu Gly Gly Thr Pro Thr Leu Ser Pro Pro Leu Cys Ser
675 680 685

Pro Asn Gly Tyr Leu Gly Ser Leu Lys Pro Gly Val Gln Gly Lys Lys
690 695 700

Val Arg Lys Pro Ser Ser Lys Gly Leu Ala Cys Gly Ser Lys Glu Ala
705 710 715 720

Lye Asp Leu Lys Ala Arg Arg Lys Lys Ser Gin Asp Gly Lys Gly Cys 
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Le u Leu Asp Ser Ser oly Met Leu Ser Pro Val Asp ser Leu Glu Ser 
740 745 

Pro His Gly Tyr Leu Ser Asp Val Ala Ser Pro Pro Leu Leu Pro Ser 
755 760 765 

Pro Phe Gin Gin Ser Pro Ser Val Pro Leu Asn His Leu Pro Gly Met 
770 775 780 

Pro Asp Thr His Leu Gly lie Gly His Leu Asn Val Ala Ala Lys Pro 
785 790 795 

Glu Met Ala Ala Leu Gly Gly Oly Gly Arg Leu Ala Phe Glu Thr Gly 

Pro Pro Arg Leu Ser His Leu Pro Val Ala Ser Gly Thr Ser .rhr Val 

Leu Gly Ser Ser Ser Gly Gly Ala Leu Asn Phe Thr Val Gly Gly Ser 
835 8 40 845 

Thr Ser Leu Asn Gly Gin Cys Glu Trp Leu Ser Arg Leu Gin Ser Gly 
850 855 860 

Met Val Pro Asn Gin Tyr Asn Pro Leu Arg Gly Ser Val Ala Pro Gly 
865 870 875 

Pro Leu Ser Thr Gin Ala Pro Ser Leu. Gin His Gly Met Val Gly Pro 
885 890 8SS 

Leu Hie Ser Ser Leu Ala Ala Ser Ala Leu Ser Gin Met Met Ser Tyr 

900 905 9 

Gin Gly Leu Pro Ser Thr Arg Leu Ala Thr Gin Pro His Leu Val Gin 
915 920 925 

Thr Gin Gin Val Gin Pro Gin Asn Leu Gin Met Gin Gin Gin Asn Leu 
930 935 9 4 <> 

Gin Pro Ala Asn lie Gin Gin Gin Gin Ser Leu Gin Pro Pro Pro Pro 
945 950 9 55 * ow 

Pro Pro Gin Pro His Leu Gly Val Ser Ser Ala Ala Ser Gly His Leu 

965 970 

Gly Arg Ser Phe Leu Ser Gly Glu Pro Ser Gin Ala Asp Val Gin Pro 

980 985 

Leu Gly Pro Ser Ser Leu Ala Val^His Thr lie Leu Pro^Gln Glu Ser 



995 



Pro Ala Leu Pro Thr Ser Leu Pro Ser Ser Leu Val Pro Pro Val Thr 
1010 1015 1020 

Ala Ala Gln Phe Leu Thr Pro Pro Ser Gln His Ser Tyr Ser Ser Pro
1025 1030 1035 1040

Val Asp Asn Thr Pro Ser His Gln Leu Gln Val Pro Val Pro Val Met
1045 1050 1055

Val Met Ile Arg Ser Ser Asp Pro Ser Lys Gly Ser Ser Ile Leu Ile
1060 1065 1070

Glu Ala Pro Asp Ser Trp
1075

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4268 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1972

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

G GAG GTG GAT GTG TTA GAT GTG AAT GTC CGT GGC CCA GAT GCC TGC 46
Glu Val Asp Val Leu Asp Val Asn Val Arg Gly Pro Asp Gly Cys
1 5 10 15

ACC CCA TTG ATG TTG GCT TCT CTC CGA GGA GGC AGC TCA GAT TTG AGT 94
Thr Pro Leu Met Leu Ala Ser Leu Arg Gly Gly Ser Ser Asp Leu Ser
20 25 30

r*r GAA GAT GAA GAT GCA GAG GAC TCT TCT GCT AAC ATC ATC ACA GAC 142
Asp Glu Asp Glu Asp Ala Glu Asp Ser Ser Ala Asn Ile Ile Thr Asp
35 40 45

TTr CTC TAC CAG GGT GCC AGC CTC CAG GCC CAG ACA GAC CGG ACT GGT 190
Leu Val Tyr Gln Gly Ala Ser Leu Gln Ala Gln Thr Asp Arg Thr Gly
50 55 60

GAG ATG GCC CTG CAC CTT CCA GCC CGC TAC TCA CGG GCT GAT GCT GCC 238
Glu Met Ala Leu His Leu Ala Ala Arg Tyr Ser Arg Ala Asp Ala Ala
65 70 75

.. P CGT CTC CTG GAT GCA GGT GCA GAT GCC AAT GCC CAC GAC AAC ATG 286
Lys Arg Leu Leu Asp Ala Gly Ala Asp Ala Asn Ala Gln Asp Asn Met
80 85 90 95

GGC CGC TGT CCA CTC CAT CCT GCA GTG GCA GCT GAT GCC CAA CGT GTC 334
Gly Arg Cys Pro Leu His Ala Ala Val Ala Ala Asp Ala Gln Gly Val
100 105 110

TTC CAG ATT CTG ATT CGC AAC CGA GTA ACT GAT CTA GAT GCC AGG ATG 382
Phe Gln Ile Leu Ile Arg Asn Arg Val Thr Asp Leu Asp Ala Arg Met
115 120 125

rGT ACT A cA CCC CTG ATC CTG GCT GCC CGC CTC GCT GTG GAG 430
Asn Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala Arg Leu Ala Val Glu
130 135 140

GGA ATG GTG GCA GAA CTC ATC AAC TGC CAA GCG GAT GTG AAT GCA GTG 478
Gly Met Val Ala Glu Leu Ile Asn Cys Gln Ala Asp Val Asn Ala Val
145 150 155

GAT GAC CAT GGA AAA TCT GCT CTT CAC TGG GCA GCT GCT GTC AAT AAT 526
Asp Asp His Gly Lys Ser Ala Leu His Trp Ala Ala Ala Val Asn Asn
160 165 170 175

GTG GAG GCA ACT CTT TTG TTG TTG AAA AAT GGG GCC AAC CGA CAC ATG 574
Val Glu Ala Thr Leu Leu Leu Leu Lys Asn Gly Ala Asn Arg Asp Met
180 185 190

CAG GAC AAC AAG GAA GAG ACA CCT CTG TTT CTT GCT GCC CGG GAG GGG 622
Gln Asp Asn Lys Glu Glu Thr Pro Leu Phe Leu Ala Ala Arg Glu Gly
195 200 205

AGC TAT GAA GCA GCC AAG ATC CTG TTA GAC CAT TTT GCC AAT CGA GAG 670
Ser Tyr Glu Ala Ala Lys Ile Leu Leu Asp His Phe Ala Asn Arg Asp
210 215 220

ATC ACA GAC CAT ATG GAT CGT CTT CCC CGG GAT GTG GCT CGG GAT CGC 718
Ile Thr Asp His Met Asp Arg Leu Pro Arg Asp Val Ala Arg Asp Arg
225 230 235

ATG CAC CAT GAC ATT GTG CGC CTT CTG GAT GAA TAC AAT GTG ACC CCA 766
Met His His Asp Ile Val Arg Leu Leu Asp Glu Tyr Asn Val Thr Pro
240 245 250 255

AGC CCT CCA GGC ACC GTC TTG ACT TCT GCT CTC TCA CCT GTC ATC TGT 814
Ser Pro Pro Gly Thr Val Leu Thr Ser Ala Leu Ser Pro Val Ile Cys
260 265 270

GGG CCC AAC AGA TCT TTC CTC AGC CTG AAG CAC ACC CCA ATG GGC AAG 862
Gly Pro Asn Arg Ser Phe Leu Ser Leu Lys His Thr Pro Met Gly Lys
275 280 285

AAG TCT AGA CGG CCC AGT GCC AAG AGT ACC ATG CCT ACT AGC CTC CCT 910
Lys Ser Arg Arg Pro Ser Ala Lys Ser Thr Met Pro Thr Ser Leu Pro
290 295 300

AAC CTT GCC AAG GAG GCA AAG GAT GCC AAG GGT AGT AGG AGG AAG AAG 958
Asn Leu Ala Lys Glu Ala Lys Asp Ala Lys Gly Ser Arg Arg Lys Lys
305 310 315

TCT CTG AGT GAG AAG GTC CAA CTG TCT GAG AGT TCA GTA ACT TTA TCC 1006
Ser Leu Ser Glu Lys Val Gln Leu Ser Glu Ser Ser Val Thr Leu Ser
320 325 330 335

CCT GTT GAT TCC CTA GAA TCT CCT CAC ACG TAT GTT TCC GAC ACC ACA 1054
Pro Val Asp Ser Leu Glu Ser Pro His Thr Tyr Val Ser Asp Thr Thr
340 345 350

TCC TCT CCA ATG ATT ACA TCC CCT GGG ATC TTA CAG GCC TCA CCC AAC 1102
Ser Ser Pro Met Ile Thr Ser Pro Gly Ile Leu Gln Ala Ser Pro Asn
355 360 365

CCT ATG TTG GCC ACT GCC GCC CCT CCT GCC CCA GTC CAT GCC CAG CAT 1150
Pro Met Leu Ala Thr Ala Ala Pro Pro Ala Pro Val His Ala Gln His
370 375 380

GCA CTA TCT TTT TCT AAC CTT CAT GAA ATG CAG CCT TTG GCA CAT GGG 1198
Ala Leu Ser Phe Ser Asn Leu His Glu Met Gln Pro Leu Ala His Gly
385 390 395

GCC ACC ACT GTG CTT CCC TCA GTG AGC CAG TTG CTA TCC CAC CAC CAC 1246
Ala Ser Thr Val Leu Pro Ser Val Ser Gln Leu Leu Ser His His His
400 405 410 415

ATT GTG TCT CCA GGC AGT GGC AGT GCT GGA AGC TTG AGT AGG CTC CAT 1294
Ile Val Ser Pro Gly Ser Gly Ser Ala Gly Ser Leu Ser Arg Leu His
420 425 430

CCA GTC CCA GTC CCA GCA GAT TCG ATG AAC CGC ATG GAG GTG AAT GAG 1342
Pro Val Pro Val Pro Ala Asp Trp Met Asn Arg Met Glu Val Asn Glu
435 440 445

ACC CAG TAC AAT GAG ATG TTT GGT ATG CTC CTG GCT CCA CCT GAG GGC 1390
Thr Gln Tyr Asn Glu Met Phe Gly Met Val Leu Ala Pro Ala Glu Gly
450 455 460

ACC CAT CCT GGC ATA GCT CCC CAG AGC AGG CCA CCT GAA GGG AAG CAC 1438
Thr His Pro Gly Ile Ala Pro Gln Ser Arg Pro Pro Glu Gly Lys His
465 470 475

ATA ACC ACC CCT CGG GAG CCC TTG CCC CCC ATT GTG ACT TTC CAG CTC 1486
Ile Thr Thr Pro Arg Glu Pro Leu Pro Pro Ile Val Thr Phe Gln Leu
480 485 490 495

ATC CCT AAA GGC AGT ATT GCC CAA CCA GCG GGG CCT CCC CAG CCT CAG 1534
Ile Pro Lys Gly Ser Ile Ala Gln Pro Ala Gly Ala Pro Gln Pro Gln
500 505 510

TCC ACC TGC CCT CCA GCT GTT GCG GGC CCC CTG CCC ACC ATG TAC CAG 1582
Ser Thr Cys Pro Pro Ala Val Ala Gly Pro Leu Pro Thr Met Tyr Gln
515 520 525

ATT CCA GAA ATG GCC CGT TTG CCC AGT GTG GCT TTC CCC ACT GCC ATG 1630
Ile Pro Glu Met Ala Arg Leu Pro Ser Val Ala Phe Pro Thr Ala Met
530 535 540

ATG CCC CAG CAG GAC GGG CAG GTA GCT CAG ACC ATT CTC CCA GCC TAT 1678
Met Pro Gln Gln Asp Gly Gln Val Ala Gln Thr Ile Leu Pro Ala Tyr
545 550 555

CAT CCT TTC CCA GCC TCT GTG GGC AAG TAC CCC ACA CCC CCT TCA CAG 1726
His Pro Phe Pro Ala Ser Val Gly Lys Tyr Pro Thr Pro Pro Ser Gln
560 565 570 575

CAC AGT TAT GCT TCC TCA AAT GCT GCT GAG CGA ACA CCC AGT CAC AGT 1774
His Ser Tyr Ala Ser Ser Asn Ala Ala Glu Arg Thr Pro Ser His Ser
580 585 590

GCT CAC CTC CAG GGT GAG CAT CCC TAC CTG ACA CCA TCC CCA GAG TCT 1822
Gly His Leu Gln Gly Glu His Pro Tyr Leu Thr Pro Ser Pro Glu Ser
595 600 605

CCT GAC CAG TGG TCA AGT TCA TCA CCC CAC TCT GCT TCT GAC TGG TCA 1870
Pro Asp Gln Trp Ser Ser Ser Ser Pro His Ser Ala Ser Asp Trp Ser
610 615 620

GAT GTG ACC ACC AGC CCT ACC CCT GGG GGT GCT GGA GGA GGT CAG CGG 1918
Asp Val Thr Thr Ser Pro Thr Pro Gly Gly Ala Gly Gly Gly Gln Arg
625 630 635

CGA CCT GGG ACA CAC ATG TCT GAG CCA CCA CAC AAC AAC ATG CAG GTT 1966
Gly Pro Gly Thr His Met Ser Glu Pro Pro His Asn Asn Met Gln Val
640 645 650 655

TAT GCG TGAGAGAGTC CACCTCCAGT GTAGAGACAT AACTGACTTT TGTAAATGCT 2022 
Tyr Ala 



GCTGAGGAAC AAATGAAGGT CATCCGGGAG AGAAATGAAG AAATCTCTGG AGCCAGCTTC 2082
TAGAGGTAGG AAAGAGAAGA TGTTCTTATT CAGATAATGC AAGAGAAGCA ATTCGTCAGT 2142
TTCACTGGGT ATCTGCAAGG CTTATTGATT ATTCTAATCT . JVTAAG ACAA GTTTGTGGAA 2202
ATGCAAGATG AATACAAGCC TTGGGTCCAT GTTTACTCTC TTCTATTTGG AGAATAAGAT 2262
GGATGCTTAT TGAAGCCCAG ACATTCTTGC AGCTTGGACT GCATTTTAAG CCCTGCAGGC 2322
TTCTGCCATA TCCATGAGAA GATTCTACAC TAGCGTCCTG TTGGGAATTA TGCCCTGGAA 2382
TTCTGCCTGA ATTGACCTAC GCATCTCCTC CTCCTTGGAC ATTCTTTTGT CTTCATTTGG 2442
TGCTTTTGGT TTTGCACCTC TCCGTCATTG TAGCCCTACC AGCATGTTAT AGGGCAAGAC 2502
CTTTGTGCTT TTGATCATTC TGGCCCATGA AAGCAACTTT GGTCTCCTTT CCCCTCCTGT 2562
CTTCCCGGTA TCCCTTGGAG TCTCACAAGG TTTACTTTGG TATGGTTCTC AGCACAAACC 2622
TTTCAAGTAT GTTCTTTCTT TGGAAAATGG ACATACTGTA TTGTGTTCTC CTGCATATAT 2682
CATTCCTCGA GAGAGAAGGG GAGAAGAATA CTTTTCTTCA ACAAATTTTG GGGGCAGGAG 2742
ATCCCTTCAA GAGGCTGCAC CTTAATTTTT CTTGTCTGTG TGCAGGTCTT CATATAAACT 2802
TTACCAGGAA GAAGGGTGTG AGTTTGTTGT TTTTCTGTGT ATGGGCCTGG TCAGTGTAAA 2862
GTTTTATCCT TGATAGTCTA GTTACTATGA CCCTCCCCAC TTTTTTAAAA CCAGAAAAAG 2922
GTTTGGAATG TTCGAATGAC CAAGAGACAA GTTAACTCGT GCAAGAGCCA GTTACCCACC 2982
CACAGGTCCC CCTACTTCCT GCCAAGCATT CCATTGACTG CCTGTATGGA ACACATTTGT 3042
CCCAGATCTG AGCATTCTAG GCCTGTTTCA CTCACTCACC CAGCATATGA AACTAGTCTT 3102
AACTGTTGAG CCTTTCCTTT CATATCCACA GAAGACACTG TCTCAAATGT TGTACCCTTG 3162
CCATTTAGGA CTCAACTTTC CTTAGCCCAA GGGACCCAGT GACAGTTGTC TTCCGTTTGT 3222
CAGATGATCA GTCTCTACTG ATTATCTTGC TGCTTAAAGG CCTGCTCACC AATCTTTCTT 3282
TCACACCGTG TGGTCCG^GT TACTGGTATA CCAGTATGT TCTCACTGAA GACATGGACT 3342
TTATATGTTC AAGTGCAGGA ATTGGAAAGT TGGACTTGTT TTCIATGATC CAAAACAGCC 3402
CTATAAGAAG GTTGGAAAAG GAGGAACTAT ATAGCAGCCT TTGCTATTTT CTGCTACCAT 3462
TTCTTTTCCT CTGAAGCGGC CATGACATTC CCTTTGGCAA CTAACGTAGA AACTCAACAG 3522
AACATTTTCC TTTCCTAGAG TCACCTTTTA GATGATAATG GACAACTATA GACTTGCTCA 3582
TTGTTCAGAC TGATTGCCCC TCACCTGAAT CCACTCTCTG TATTCATGCT CTTGGCAATT 3642
TCTTTGACTT TCTTTTAAGG GCAGAAGCAT TTTAGTTAAT TCTAGATAAA GAATAGTTTT 3702
CTTCCTCTTC TCCTTGGGCC AGTTAATAAT TGGTCCATGG CTACACTGCA ACTTCCGTCC 3762
AGTGCTGTGA TGCCCATGAC ACCTGCAAAA TAAGTTCTGC CTGGGCATTT TGTAGATATT 3822
AACAGGTGAA TTCCCGACTC TTTTGGTTTG AATGACAGTT CTCATTCCTT CTATCGCTCC 3882
AAGTATGCAT CAGTGCTTCC CACTTACCTG ATTTGTCTGT CGGTGGCCCC ATATGGAAAC 3942
CCTGCGTGTC TGTTGGCATA ATAGTTTACA AATGGTTTTT TCAGTCCTAT CCAAATTTAT 4002
TGAACCAACA AAAATAATTA CTTCTGCCCT GAGATAAGCA GATTAAGTTT GTTCATTCTC 4062
TGCTTTATTC TCTCCATGTG GCAACATTCT GTCAGCCTCT TTCATAGTGT GCAAACATTT 4122
TATCATTCTA AATGGTGACT CTCTGCCCTT GGACCCATTT ATTATTCACA GATGGGGAGA 4182
ACCTATCTGC ATGGACCCTC ACCATCCTCT GTGCAGCACA CACAGTGCAG GGAGCCAGTG 4242
GCGATGGCGA TGACTTTCTT CCCCTG 4268

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 657 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Glu Val Asp Val Leu Asp Val Asn Val Arg Gly Pro Asp Gly Cys Thr
1 5 10 15

Pro Leu Met Leu Ala Ser Leu Arg Gly Gly Ser Ser Asp Leu Ser Asp
20 25 30

Glu Asp Glu Asp Ala Glu Asp Ser Ser Ala Asn Ile Ile Thr Asp Leu
35 40 45

Val Tyr Gln Gly Ala Ser Leu Gln Ala Gln Thr Asp Arg Thr Gly Glu
50 55 60

Met Ala Leu His Leu Ala Ala Arg Tyr Ser Arg Ala Asp Ala Ala Lys
65 70 75 80

Arg Leu Leu Asp Ala Gly Ala Asp Ala Asn Ala Gln Asp Asn Met Gly
85 90 95

Arg Cys Pro Leu His Ala Ala Val Ala Ala Asp Ala Gln Gly Val Phe
100 105 110

Gln Ile Leu Ile Arg Asn Arg Val Thr Asp Leu Asp Ala Arg Met Asn
115 120 125

Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala Arg Leu Ala Val Glu Gly
130 135 140

Met Val Ala Glu Leu Ile Asn Cys Gln Ala Asp Val Asn Ala Val Asp
145 150 155 160

Asp His Gly Lys Ser Ala Leu His Trp Ala Ala Ala Val Asn Asn Val
165 170 175

Glu Ala Thr Leu Leu Leu Leu Lys Asn Gly Ala Asn Arg Asp Met Gln
180 185 190

Asp Asn Lys Glu Glu Thr Pro Leu Phe Leu Ala Ala Arg Glu Gly Ser
195 200 205

Tyr Glu Ala Ala Lys Ile Leu Leu Asp His Phe Ala Asn Arg Asp Ile
210 215 220

Thr Asp His Met Asp Arg Leu Pro Arg Asp Val Ala Arg Asp Arg Met
225 230 235 240

His His Asp Ile Val Arg Leu Leu Asp Glu Tyr Asn Val Thr Pro Ser
245 250 255

Pro Pro Gly Thr Val Leu Thr Ser Ala Leu Ser Pro Val Ile Cys Gly
260 265 270

Pro Asn Arg Ser Phe Leu Ser Leu Lys His Thr Pro Met Gly Lys Lys
275 280 285

Ser Arg Arg Pro Ser Ala Lys Ser Thr Met Pro Thr Ser Leu Pro Asn
290 295 300

Leu Ala Lys Glu Ala Lys Asp Ala Lys Gly Ser Arg Arg Lys Lys Ser
305 310 315 320

Leu Ser Glu Lys Val Gln Leu Ser Glu Ser Ser Val Thr Leu Ser Pro
325 330 335

Val Asp Ser Leu Glu Ser Pro His Thr Tyr Val Ser Asp Thr Thr Ser
340 345 350

Ser Pro Met Ile Thr Ser Pro Gly Ile Leu Gln Ala Ser Pro Asn Pro
355 360 365

Met Leu Ala Thr Ala Ala Pro Pro Ala Pro Val His Ala Gln His Ala
370 375 380

Leu Ser Phe Ser Asn Leu His Glu Met Gln Pro Leu Ala His Gly Ala
385 390 395 400

Ser Thr Val Leu Pro Ser Val Ser Gln Leu Leu Ser His His His Ile
405 410 415

Val Ser Pro Gly Ser Gly Ser Ala Gly Ser Leu Ser Arg Leu His Pro
420 425 430

Val Pro Val Pro Ala Asp Trp Met Asn Arg Met Glu Val Asn Glu Thr
435 440 445

Gln Tyr Asn Glu Met Phe Gly Met Val Leu Ala Pro Ala Glu Gly Thr
450 455 460

His Pro Gly Ile Ala Pro Gln Ser Arg Pro Pro Glu Gly Lys His Ile
465 470 475 480

Thr Thr Pro Arg Glu Pro Leu Pro Pro Ile Val Thr Phe Gln Leu Ile
485 490 495

Pro Lys Gly Ser Ile Ala Gln Pro Ala Gly Ala Pro Gln Pro Gln Ser
500 505 510

Thr Cys Pro Pro Ala Val Ala Gly Pro Leu Pro Thr Met Tyr Gln Ile
515 520 525

Pro Glu Met Ala Arg Leu Pro Ser Val Ala Phe Pro Thr Ala Met Met
530 535 540

Pro Gln Gln Asp Gly Gln Val Ala Gln Thr Ile Leu Pro Ala Tyr His
545 550 555 560

Pro Phe Pro Ala Ser Val Gly Lys Tyr Pro Thr Pro Pro Ser Gln His
565 570 575

Ser Tyr Ala Ser Ser Asn Ala Ala Glu Arg Thr Pro Ser His Ser Gly
580 585 590

His Leu Gln Gly Glu His Pro Tyr Leu Thr Pro Ser Pro Glu Ser Pro
595 600 605

Asp Gln Trp Ser Ser Ser Ser Pro His Ser Ala Ser Asp Trp Ser Asp
610 615 620

Val Thr Thr Ser Pro Thr Pro Gly Gly Ala Gly Gly Gly Gln Arg Gly
625 630 635 640

Pro Gly Thr His Met Ser Glu Pro Pro His Asn Asn Met Gln Val Tyr
645 650 655

Ala



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 77 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Glu Asp Ile Asp Glu Cys Asp Gln Gly Ser Pro Cys Glu His Asn Gly
1 5 10 15

Ile Cys Val Asn Thr Pro Gly Ser Tyr Arg Cys Asn Cys Ser Gln Gly
20 25 30

Phe Thr Gly Pro Arg Cys Glu Thr Asn Ile Asn Glu Cys Glu Ser His
35 40 45

Pro Cys Gln Asn Glu Gly Ser Cys Leu Asp Asp Pro Gly Thr Phe Arg
50 55 60

Cys Val Cys Met Pro Gly Phe Thr Gly Thr Gln Cys Glu
65 70 75

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Asn Asp Val Asp Glu Cys Ser Leu Gly Ala Asn Pro Cys Glu His Gly
1 5 10 15

Gly Arg Cys Thr Asn Thr Leu Gly Ser Phe Gln Cys Asn Cys Pro Gln
20 25 30

Gly Tyr Ala Gly Pro Arg Cys Glu Ile Asp Val Asn Glu Cys Leu Ser
35 40 45

Asn Pro Cys Gln Asn Asp Ser Thr Cys Leu Asp Gln Ile Gly Glu Phe
50 55 60

Gln Cys Ile Cys Met Pro Gly Tyr Glu Gly Leu Tyr Cys Glu
65 70 75

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 654 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Thr Pro Pro Gln Gly Glu Ile Glu Ala Asp Cys Met Asp Val Asn Val
1 5 10 15

Arg Gly Pro Asp Gly Phe Thr Pro Leu Met Ile Ala Ser Cys Ser Gly
20 25 30

Gly Gly Leu Glu Thr Gly Asn Ser Glu Glu Glu Glu Asp Ala Ser Ala
35 40 45

Asn Met Ile Ser Asp Phe Ile Gly Gln Gly Ala Gln Leu His Asn Gln
50 55 60

Thr Asp Arg Thr Gly Glu Thr Ala Leu His Leu Ala Ala Arg Tyr Ala
65 70 75 80

Arg Ala Asp Ala Ala Lys Arg Leu Leu Glu Ser Ser Ala Asp Ala Asn
85 90 95

Val Gln Asp Asn Met Gly Arg Thr Pro Leu His Ala Ala Val Ala Ala
100 105 110

Asp Ala Gln Gly Val Phe Gln Ile Leu Ile Arg Asn Arg Ala Thr Asp
115 120 125

Leu Asp Ala Arg Met Phe Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala
130 135 140

Arg Leu Ala Val Glu Gly Met Val Glu Glu Leu Ile Asn Ala His Ala
145 150 155 160

Asp Val Asn Ala Val Asp Glu Phe Gly Lys Ser Ala Leu His Trp Ala
165 170 175

Ala Ala Val Asn Asn Val Asp Ala Ala Ala Val Leu Leu Lys Asn Ser
180 185 190

Ala Asn Lys Asp Met Gln Asn Asn Lys Glu Glu Thr Ser Leu Phe Leu
195 200 205

Ala Ala Arg Glu Gly Ser Tyr Glu Thr Ala Lys Val Leu Leu Asp His
210 215 220

Tyr Ala Asn Arg Asp Ile Thr Asp His Met Asp Arg Leu Pro Arg Asp
225 230 235 240

Ile Ala Gln Glu Arg Met His His Asp Ile Val His Leu Leu Asp Glu
245 250 255

Tyr Asn Leu Val Lys Ser Pro Thr Leu His Asn Gly Pro Leu Gly Ala
260 265 270

Thr Thr Leu Ser Pro Pro Ile Cys Ser Pro Asn Gly Tyr Met Gly Asn
275 280 285

Met Lys Pro Ser Val Gln Ser Lys Lys Ala Arg Lys Pro Ser Ile Lys
290 295 300

Gly Asn Gly Cys Lys Glu Ala Lys Glu Leu Lys Ala Arg Arg Lys Lys
305 310 315 320

Ser Gln Asp Gly Lys Thr Thr Leu Leu Asp Ser Gly Ser Ser Gly Val
325 330 335

Leu Ser Pro Val Asp Ser Leu Glu Ser Thr His Gly Tyr Leu Ser Asp
340 345 350

Val Ser Ser Pro Pro Leu Met Thr Ser Pro Phe Gln Gln Ser Pro Ser
355 360 365

Met Pro Leu Asn His Leu Thr Ser Met Pro Glu Ser Gln Leu Gly Met
370 375 380

Asn His Ile Asn Met Ala Thr Lys Gln Glu Met Ala Ala Gly Ser Asn
385 390 395 400

Arg Met Ala Phe Asp Ala Met Val Pro Arg Leu Thr His Leu Asn Ala
405 410 415

Ser Ser Pro Asn Thr Ile Met Ser Asn Gly Ser Met His Phe Thr Val
420 425 430

Gly Gly Ala Pro Thr Met Asn Ser Gln Cys Asp Trp Leu Ala Arg Leu
435 440 445

Gln Asn Gly Met Val Gln Asn Gln Tyr Asp Pro Ile Arg Asn Gly Ile
450 455 460

Gln Gln Gly Asn Ala Gln Gln Ala Gln Ala Leu Gln His Gly Leu Met
465 470 475 480

Thr Ser Leu His Asn Gly Leu Pro Ala Thr Thr Leu Ser Gln Met Met
485 490 495

Thr Tyr Gln Ala Met Pro Asn Thr Arg Leu Ala Asn Gln Pro His Leu
500 505 510

Met Gln Ala Gln Gln Met Gln Gln Gln Gln Asn Leu Gln Leu His Gln
515 520 525

Ser Met Gln Gln Gln His His Asn Ser Ser Thr Thr Ser Thr His Ile
530 535 540

Asn Ser Pro Phe Cys Ser Ser Asp Ile Ser Gln Thr Asp Leu Gln Gln
545 550 555 560

Met Ser Ser Asn Asn Ile His Ser Val Met Pro Gln Asp Thr Gln Ile
565 570 575

Phe Ala Ala Ser Leu Pro Ser Asn Leu Thr Gln Ser Met Thr Thr Ala
580 585 590

Gln Phe Leu Thr Pro Pro Ser Gln His Ser Tyr Ser Ser Pro Met Asp
595 600 605

Asn Thr Pro Ser His Gln Leu Gln Val Pro As: His Pro Phe Leu Thr
610 615 620

Pro Ser Pro Glu Ser Pro Asp Gln Trp Ser Ser Ser Ser Pro His Ser
625 630 635 640

Asn Met Ser Asp Trp Ser Glu Gly Ile Ser Ser Pro Pro Thr
645 650

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 666 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Thr Pro Pro Gln Gly Glu Val Asp Ala Asp Cys Met Asp Val Asn Val
1 5 10 15

Arg Gly Pro Asp Gly Phe Thr Pro Leu Met Ile Ala Ser Cys Ser Gly
20 25 30

Gly Gly Leu Glu Thr Gly Asn Ser Glu Glu Glu Glu Asp Ala Pro Ala
35 40 45

Val Ile Ser Asp Phe Ile Tyr Gln Gly Ala Ser Leu His Asn Gln Thr
50 55 60

Asp Arg Thr Gly Glu Thr Ala Leu His Leu Ala Ala Arg Tyr Ser Arg
65 70 75 80

Ser Asp Ala Ala Lys Arg Leu Leu Glu Ala Ser Ala Asp Ala Asn Ile
85 90 95

Gln Asp Asn Met Gly Arg Thr Pro Leu His Ala Ala Val Ser Ala Asp
100 105 110

Ala Gln Gly Val Phe Gln Ile Leu Leu Arg Asn Arg Ala Thr Asp Leu
115 120 125

Asp Ala Arg Met His Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala Arg
130 135 140

Leu Ala Val Glu Gly Met Leu Glu Asp Leu Ile Asn Ser His Ala Asp
145 150 155 160

Val Asn Ala Val Asp Asp Leu Gly Lys Ser Ala Leu His Trp Ala Ala
165 170 175

Ala Val Asn Asn Val Asp Ala Ala Val Val Leu Leu Lys Asn Gly Ala
180 185 190

Asn Lys Asp Met Gln Asn Asn Lys Glu Glu Thr Pro Leu Phe Leu Ala
195 200 205

Ala Arg Glu Gly Ser Tyr Glu Thr Ala Lys Val Leu Leu Asp His Phe
210 215 220

Ala Asn Arg Asp Ile Thr Asp His Met Asp Arg Leu Pro Arg Asp Ile
225 230 235 240

Ala Gln Glu Arg Met His His Asp Ile Val Arg Leu Leu Asp Glu Tyr
245 250 255

Asn Leu Val Arg Ser Pro Gln Leu His Gly Thr Ala Leu Gly Gly Thr
260 265 270

Pro Thr Leu Ser Pro Thr Leu Cys Ser Pro Asn Gly Tyr Leu Gly Asn
275 280 285

Leu Lys Ser Ala Thr Gln Gly Lys Lys Ala Arg Lys Pro Ser Thr Lys
290 295 300

Gly Leu Ala Cys Ser Ser Lys Glu Ala Lys Asp Leu Lys Ala Arg Arg
305 310 315 320

Lys Lys Ser Gln Asp Gly Lys Gly Cys Leu Leu Asp Ser Ser Ser Met
325 330 335

Leu Ser Pro Val Asp Ser Leu Glu Ser Pro His Gly Tyr Leu Ser Asp
340 345 350

Val Ala Ser Pro Pro Leu Pro Ser Pro Phe Gln Gln Ser Pro Ser Met
355 360 365

Pro Leu Ser His Leu Pro Gly Met Pro Asp Thr His Leu Gly Ile Ser
370 375 380

His Leu Asn Val Ala Ala Lys Pro Glu Met Ala Ala Leu Ala Gly Gly
385 390 395 400

Ser Arg Leu Ala Phe Glu Pro Pro Pro Pro Arg Leu Ser His Leu Pro
405 410 415

Val Ala Ser Ser Ala Ser Thr Val Leu Ser Thr Asn Gly Thr Gly Ala
420 425 430

Met Asn Phe Thr Val Gly Ala Pro Ala Ser Leu Asn Gly Gln Cys Glu
435 440 445

Trp Leu Pro Arg Leu Gln Asn Gly Met Val Pro Ser Gln Tyr Asn Pro
450 455 460

Leu Arg Pro Gly Val Thr Pro Gly Thr Leu Ser Thr Gln Ala Ala Gly
465 470 475 480

Leu Gln His Gly Met Met Ser Pro Ile His Ser Ser Leu Ser Thr Asn
485 490 495

Thr Leu Ser Pro Ile Ile Tyr Gln Gly Leu Pro Asn Thr Arg Leu Ala
500 505 510

Thr Gln Pro His Leu Val Gln Thr Gln Gln Val Gln Pro Gln Asn Leu
515 520 525

Gln Ile Gln Pro Gln Asn Leu Gln Pro Pro Ser Gln Pro His Leu Ser
530 535 540

Val Ser Ser Ala Ala Asn Gly His Leu Gly Arg Ser Phe Leu Ser Gly
545 550 555 560

Glu Pro Ser Gln Ala Asp Val Gln Pro Leu Gly Pro Ser Ser Leu Pro
565 570 575

Val His Thr Ile Leu Pro Gln Glu Ser Gln Ala Leu Pro Thr Ser Leu
580 585 590

Pro Ser Ser Met Val Pro Pro Met Thr Thr Thr Gln Phe Leu Thr Pro
595 600 605

Pro Ser Gln His Ser Tyr Ser Ser Ser Pro Val Asp Asn Thr Pro Ser
610 615 620

His Gln Leu Gln Val Pro Glu His Pro Phe Leu Thr Pro Ser Pro Glu
625 630 635 640

Ser Pro Asp Gln Trp Ser Ser Ser Ser Arg His Ser Asn Ile Ser Asp
645 650 655

Trp Ser Glu Gly Ile Ser Ser Pro Pro Thr
660 665



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 681 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Thr Pro Pro Gln Gly Glu Val Asp Ala Asp Cys Met Asp Val Asn Val
1 5 10 15

Arg Gly Pro Asp Gly Phe Thr Pro Leu Met Ile Ala Ser Cys Ser Gly
20 25 30

Gly Gly Leu Glu Thr Gly Asn Ser Glu Glu Glu Glu Asp Ala Pro Ala
35 40 45

Val Ile Ser Asp Phe Ile Tyr Gln Gly Ala Ser Leu His Asn Gln Thr
50 55 60

Asp Arg Thr Gly Glu Thr Ala Leu His Leu Ala Ala Arg Tyr Ser Arg
65 70 75 80

Ser Asp Ala Ala Lys Arg Leu Leu Glu Ala Ser Ala Asp Ala Asn Ile
85 90 95

Gln Asp Asn Met Gly Arg Thr Pro Leu His Ala Ala Val Ser Ala Asp
100 105 110

Ala Gln Gly Val Phe Gln Ile Leu Ile Arg Asn Arg Ala Thr Asp Leu
115 120 125

Asp Ala Arg Met His Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala Arg
130 135 140

Leu Ala Val Glu Gly Met Leu Glu Asp Leu Ile Asn Ser His Ala Asp
145 150 155 160

Val Asn Ala Val Asp Asp Leu Gly Lys Ser Ala Leu His Trp Ala Ala
165 170 175

Ala Val Asn Asn Val Asp Ala Ala Val Val Leu Leu Lys Asn Gly Ala
180 185 190

Asn Lys Asp Met Gln Asn Asn Arg Glu Glu Thr Pro Leu Phe Leu Ala
195 200 205

Ala Arg Glu Gly Ser Tyr Glu Thr Ala Lys Val Leu Leu Asp His Phe
210 215 220

Ala Asn Arg Asp Ile Thr Asp His Met Asp Arg Leu Pro Arg Asp Ile
225 230 235 240

Ala Gln Glu Arg Met His His Asp Ile Val Arg Leu Leu Asp Glu Tyr
245 250 255

Asn Leu Val Arg Ser Pro Gln Leu His Gly Ala Pro Leu Gly Gly Thr
260 265 270

Pro Thr Leu Ser Pro Pro Leu Cys Ser Pro Asn Gly Tyr Leu Gly Ser
275 280 285

Leu Lys Pro Gly Val Gln Gly Lys Lys Val Arg Lys Pro Ser Ser Lys
290 295 300

Gly Leu Ala Cys Gly Ser Lys Glu Ala Lys Asp Leu Lys Ala Arg Arg
305 310 315 320

Lys Lys Ser Gln Asp Gly Lys Gly Cys Leu Leu Asp Ser Ser Gly Met
325 330 335

Leu Ser Pro Val Asp Ser Leu Glu Ser Pro His Gly Tyr Leu Ser Asp
340 345 350

Val Ala Ser Pro Pro Leu Leu Pro Ser Pro Phe Gln Gln Ser Pro Ser
355 360 365

Val Pro Leu Asn His Leu Pro Gly Met Pro Asp Thr His Leu Gly Ile
370 375 380

Gly His Leu Asn Val Ala Ala Lys Pro Glu Met Ala Ala Leu Gly Gly
385 390 395 400

Gly Gly Arg Leu Ala Phe Glu Thr Gly Pro Pro Arg Leu Ser His Leu
405 410 415

Pro Val Ala Ser Gly Thr Ser Thr Val Leu Gly Ser Ser Ser Gly Gly
420 425 430

Ala Leu Asn Phe Thr Val Gly Gly Ser Thr Ser Leu Asn Gly Gln Cys
435 440 445

Glu Trp Leu Ser Arg Leu Gln Ser Gly Met Val Pro Asn Gln Tyr Asn
450 455 460

Pro Leu Arg Gly Ser Val Ala Pro Gly Pro Leu Ser Thr Gln Ala Pro
465 470 475 480

Ser Leu Gln His Gly Met Val Gly Pro Leu His Ser Ser Leu Ala Ala
485 490 495

Ser Ala Leu Ser Gln Met Met Ser Tyr Gln Gly Leu Pro Ser Thr Arg
500 505 510

Leu Ala Thr Gln Pro His Leu Val Gln Thr Gln Gln Val Gln Pro Gln
515 520 525

Asn Leu Gln Met Gln Gln Gln Asn Leu Gln Pro Ala Asn Ile Gln Gln
530 535 540

Gln Gln Ser Leu Gln Pro Pro Pro Pro Pro Pro Gln Pro His Leu Gly
545 550 555 560

Val Ser Ser Ala Ala Ser Gly His Leu Gly Arg Ser Phe Leu Ser Gly
565 570 575

Glu Pro Ser Gln Ala Asp Val Gln Pro Leu Gly Pro Ser Ser Leu Ala
580 585 590

Val His Thr Ile Leu Pro Gln Glu Ser Pro Ala Leu Pro Thr Ser Leu
595 600 605

Pro Ser Ser Leu Val Pro Pro Val Thr Ala Ala Gln Phe Leu Thr Pro
610 615 620

Pro Ser Gln His Ser Tyr Ser Ser Pro Val Glu Asn Thr Pro Ser His
625 630 635 640

Gln Leu Gln Val Pro Glu His Pro Phe Leu Thr Pro Ser Pro Glu Ser
645 650 655

Pro Asp Gln Trp Ser Ser Ser Ser Pro His Ser Asn Val Ser Asp Trp
660 665 670

Ser Glu Gly Val Ser Ser Pro Pro Thr
675 680



WO 94/07474 



PCT/US93/09338 




(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2471 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Pro Ala Leu Arg Pro Ala Leu Leu Trp Ala Leu Leu Ala Leu Trp 
15 10 15 

Leu Cys Cys Ala Ala Pro Ala His Ala Leu Gin Cys Arg Asp Gly Tyr 
20 25 30 

Glu Pro Cys Val Asn Glu Gly Met Cys Val Thr Tyr His Asn Gly Thr 
35 40 45 

Gly Tyr Cys Lys Cys Pro Glu Gly Phe Leu Gly Glu Tyr Cys Gin His 
50 55 60 

Arq Asp Pro Cys Glu Lys Asn Arg Cys Gin Asn Gly Gly Thr Cys Val 
65 70 75 80 

Ala Gin Ala Met Leu Gly Lys Ala Thr Cys Arg Cys Ala Ser Gly Phe 
85 90 95 

Thr Gly Glu Asp Cys Gin Tyr Ser Thr Ser His Pro Cys Phe Val Ser 
100 105 110 

Arq Pro Cys Leu Asn Gly Gly Thr Cys His Met Leu Ser Arg Asp Thr 
115 120 125 

Tyr Giu Cys Thr Cys Gin Val Gly Phe Thr Gly Lys Glu Cys Gin Trp 
130 135 140 

Thr Asp Ala Cys Leu Ser His Pro Cys Ala Asn Gly Ser Thr Cys Thr 
145 150 155 160 

Thr Val Ala Asn Gin Phe Ser Cys Lys Cys Leu Thr Gly Phe Thr Gly 
165 170 175 

Gin Lys Cys Glu Thr Asp Val Asn Glu Cys Asp He Pro Gly His Cys 
180 185 190 

Gin His Gly Gly Thr Cys Leu Asn Leu Pro C^y Ser Tyr Gin Cys Gin 
195 200 205 

Cys Pro Gin Gly Phe Thr Gly Gin Tyr Cys Asp Ser Leu Tyr Val Pro 
210 215 220 

Cys Ala Pro Ser Pro Cys Val Asn Gly Gly Thr Cys Arg Gin Thr Gly 
225 230 235 240 

Asp Phe Thr Phe Glu Cys Asn Cys Leu Pro Gly Phe Glu Gly Ser Thr 
245 250 255 

Cvs Glu Arg Asn He Asp Asp Cys Pro Asn His Arg Cya Gin Asn Gly 
* 260 265 270 

Gly Val Cys Val Asp Gly Val Asn Thr Tyr Asn Cys Arg Cys Pro Pro 
275 280 285 



# 
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Gin Trp Thr Oly Gin Ph Cys Thr Glu Asp Val Asp Glu Cys Leu Leu 

290 295 
Gin Pro Asn Ala Cys Gin Asn Gly Gly Thr Cys Ala Asn Arg Asn Gly 
305 310 

Gly Tyr Gly Cys Val Cys Val Asn Gly Trp Ser Gly Asp Ab P Cys Ser 
Glu Asn lie Asp Asp Cys Ala Phe Ala Ser Cys Thr Pro Gly Ser Thr 



340 



Cys lie Asp Arg Val Ala Ser Phe Ser Cys Met Cys Pro Glu Gly Lys 

355 360 
Ala Gly Leu Leu Cys His Leu Asp Asp Ala Cys lie Asn Pro Cys 

370 375 
His Lys Gly Ala Leu Cys Asp Thr Asn Pro Leu Asn Gly Gin Tyr lie 
385 390 

Cys Thr Cys Pro Gin Gly Tyr Lys Gly Ala Asp Cys Thr Glu Asp Val 
' 405 410 

Asp Clu Cys Ala Met Ala Asn ser Asn Pro Cys Glu His Ala Gly Lys 
420 425 

Cys Val Asn Thr Asp Gly Ala Phe His Cys Glu Cys Leu Lys Gly Tyr 

435 440 " 5 

Ala Gly Pro Arg Cys Glu Met Asp He Asn Glu Cys His Ser Asp Pro 

450 455 
cys Gin Asn Asp Ala Thr Cys Leu Asp Lys lie Gly Gly Phe Thr Cys 

465 470 475 

Leu cys Met Pro Gly Phe Lys Gly Val His Cys Glu Leu Glu lie Asn 
485 490 

Glu Cys Gin ser Asn Pro Cys Val Asn Asn Gly Gin Cys Val Asp Lys 
500 SO 5 510 

val Asn Arg Phe Gin Cys Leu Cys Pro Pro Gly Phe Thr Gly Pro Val 
515 520 525 

Cys Gin He Asp He Asp Asp Cys Ser ser Thr Pro Cys Leu Asn Gly 

530 535 
Ala Lys Cys He Asp His Pro Asn Gly Tyr Glu Cys Gin Cys Ala Thr 
545 550 

Gly Phe Thr Gly Val Leu Cys Glu Glu Asn He Asp Asn Cys Asp Pro 
* 565 570 3 

Asp Pro cys His His Gly Gin Cys Gin Asp Gly He Asp Ser Tyr Thr 

580 585 
Cys He Cys Asn Pro Gly Tyr Met Gly Ala lie Cys Ser Asp Gin lie 

Asp Glu Cys Tyr Ser Ser Pro Cys Leu Asn Asp Gly Arg Cys He Asp 

610 615 
Leu Val Asn Gly Tyr Gin Cys Asn Cys Gin Pro Gly Thr Ser Gly Val 
625 630 

Asn Cys Glu He Asn Phe Asp Asp Cys Ala Ser Asn Pro Cys He His 
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645 650 655 

Gly lie Cya Mat Asp Gly lie ABn Arg Tyr Ser Cys Val Cys Ser Pro 
660 665 670 

Gly Phe Thr Gly Gin Arg Cys Asn lie Asp He Asp Glu Cys Ala Ser 
675 680 685 

Asn Pro Cya Arg Lys Gly Ala Thr Cys lie Asn Gly Val Asn Gly Phe 
690 695 700 

Arg Cys He Cye Pro Glu Gly Pro His His Pro Ser Cys Tyr Ser Gin 
70S , 710 715 720 

Val Asn Glu Cys Leu Ser Asn Pro Cys lie His Gly Asn Cys Thr Gly 



725 



730 735 



Gly Leu Ser Gly Tyr Lys Cys Leu Cys Asp Ala Gly Trp Val Gly lie 
740 745 750 

Asn Cys Glu Val Asp Lye Asn Glu Cys Leu Ser Asn Pro Cys Gin Asn 

* 760 765 



755 



Gly Gly Thr Cys Asp Asn Leu Val Asn Gly Tyr Arg Cys Thr Cys Lys 
' 770 775 780 

Lys Gly Phe Lys Gly Tyr Asn Cys Gln Val Asn Ile Asp Glu Cys Ala
785 790 795 800

Ser Asn Pro Cys Leu Asn Gln Gly Thr Cys Phe Asp Asp Ile Ser Gly
805 810 815

Tyr Thr Cys His Cys Val Leu Pro Tyr Thr Gly Lys Asn Cys Gln Thr
820 825 830

Val Leu Ala Pro Cys Ser Pro Asn Pro Cys Glu Asn Ala Ala Val Cys
835 840 845

Lys Glu Ser Pro Asn Phe Glu Ser Tyr Thr Cys Leu Cys Ala Pro Gly
850 855 860

Trp Gln Gly Gln Arg Cys Thr Ile Asp Ile Asp Glu Cys Ile Ser Lys
865 870 875 880

Pro Cys Met Asn His Gly Leu Cys His Asn Thr Gln Gly Ser Tyr Met
885 890 895

Cys Glu Cys Pro Pro Gly Phe Ser Gly Met Asp Cys Glu Glu Asp Ile
900 905 910

Asp Asp Cys Leu Ala Asn Pro Cys Gln Asn Gly Gly Ser Cys Met Asp
915 920 925

Gly Val Asn Thr Phe Ser Cys Leu Cys Leu Pro Gly Phe Thr Gly Asp
930 935 940

Lys Cys Gln Thr Asp Met Asn Glu Cys Leu Ser Glu Pro Cys Lys Asn
945 950 955 960

Gly Gly Thr Cys Ser Asp Tyr Val Asn Ser Tyr Thr Cys Lys Cys Gln
965 970 975

Ala Gly Phe Asp Gly Val His Cys Glu Asn Asn Ile Asn Glu Cys Thr
980 985 990

Glu Ser Ser Cys Phe Asn Gly Gly Thr Cys Val Asp Gly Ile Asn Ser
995 1000 1005

Phe Ser Cys Leu Cys Pro Val Gly Phe Thr Gly Ser Phe Cys Leu His

1010 1015 1020

Glu Ile Asn Glu Cys Ser Ser His Pro Cys Leu Asn Glu Gly Thr Cys
1025 1030 1035 1040

Val Asp Gly Leu Gly Thr Tyr Arg Cys Ser Cys Pro Leu Gly Tyr Thr
1045 1050 1055

Gly Lys Asn Cys Gln Thr Leu Val Asn Leu Cys Ser Arg Ser Pro Cys
1060 1065 1070

Lys Asn Lys Gly Thr Cys Val Gln Lys Lys Ala Glu Ser Gln Cys Leu
1075 1080 1085

Cys Pro Ser Gly Trp Ala Gly Ala Tyr Cys Asp Val Pro Asn Val Ser
1090 1095 1100

Cys Asp Ile Ala Ala Ser Arg Arg Gly Val Leu Val Glu His Leu Cys
1105 1110 1115 1120

Gln His Ser Gly Val Cys Ile Asn Ala Gly Asn Thr His Tyr Cys Gln
1125 1130 1135

Cys Pro Leu Gly Tyr Thr Gly Ser Tyr Cys Glu Glu Gln Leu Asp Glu
1140 1145 1150

Cys Ala Ser Asn Pro Cys Gln His Gly Ala Thr Cys Ser Asp Phe Ile
1155 1160 1165

Gly Gly Tyr Arg Cys Glu Cys Val Pro Gly Tyr Gln Gly Val Asn Cys
1170 1175 1180

Glu Tyr Glu Val Asp Glu Cys Gln Asn Gln Pro Cys Gln Asn Gly Gly
1185 1190 1195 1200



Thr Cys Ile Asp Leu Val Asn His Phe Lys Cys Ser Cys Pro Pro Gly
1205 1210 1215

Thr Arg Gly Leu Leu Cys Glu Glu Asn Ile Asp Asp Cys Ala Arg Gly
1220 1225 1230

Pro His Cys Leu Asn Gly Gly Gln Cys Met Asp Arg Ile Gly Gly Tyr
1235 1240 1245

Ser Cys Arg Cys Leu Pro Gly Phe Ala Gly Glu Arg Cys Glu Gly Asp
1250 1255 1260

Ile Asn Glu Cys Leu Ser Asn Pro Cys Ser Ser Glu Gly Ser Leu Asp
1265 1270 1275 1280

Cys Ile Gln Leu Thr Asn Asp Tyr Leu Cys Val Cys Arg Ser Ala Phe
1285 1290 1295

Thr Gly Arg His Cys Glu Thr Phe Val Asp Val Cys Pro Gln Met Pro
1300 1305 1310

Cys Leu Asn Gly Gly Thr Cys Ala Val Ala Ser Asn Met Pro Asp Gly
1315 1320 1325

Phe Ile Cys Arg Cys Pro Pro Gly Phe Ser Gly Ala Arg Cys Gln Ser
1330 1335 1340

Ser Cys Gly Gln Val Lys Cys Arg Lys Gly Glu Gln Cys Val His Thr
1345 1350 1355 1360

Ala Ser Gly Pro Arg Cys Phe Cys Pro Ser Pro Arg Asp Cys Glu Ser
1365 1370 1375

Gly Cys Ala Ser Ser Pro Cys Gln His Gly Gly Ser Cys His Pro Gln
1380 1385 1390

Arg Gln Pro Pro Tyr Tyr Ser Cys Gln Cys Ala Pro Pro Phe Ser Gly
1395 1400 1405

Ser Arg Cys Glu Leu Tyr Thr Ala Pro Pro Ser Thr Pro Pro Ala Thr
1410 1415 1420

Cys Leu Ser Gln Tyr Cys Ala Asp Lys Ala Arg Asp Gly Val Cys Asp
1425 1430 1435 1440

Glu Ala Cys Asn Ser His Ala Cys Gln Trp Asp Gly Gly Asp Cys Ser
1445 1450 1455

Leu Thr Met Glu Asn Pro Trp Ala Asn Cys Ser Ser Pro Leu Pro Cys
1460 1465 1470

Trp Asp Tyr Ile Asn Asn Gln Cys Asp Glu Leu Cys Asn Thr Val Glu
1475 1480 1485

Cys Leu Phe Asp Asn Phe Glu Cys Gln Gly Asn Ser Lys Thr Cys Lys
1490 1495 1500

Tyr Asp Lys Tyr Cys Ala Asp His Phe Lys Asp Asn His Cys Asn Gln
1505 1510 1515 1520

Gly Cys Asn Ser Glu Glu Cys Gly Trp Asp Gly Leu Asp Cys Ala Ala
1525 1530 1535

Asp Gln Pro Glu Asn Leu Ala Glu Gly Thr Leu Val Ile Val Val Leu
1540 1545 1550

Met Pro Pro Glu Gln Leu Leu Gln Asp Ala Arg Ser Phe Leu Arg Ala
1555 1560 1565

Leu Gly Thr Leu Leu His Thr Asn Leu Arg Ile Lys Arg Asp Ser Gln
1570 1575 1580

Gly Glu Leu Met Val Tyr Pro Tyr Tyr Gly Glu Lys Ser Ala Ala Met
1585 1590 1595 1600

Lys Lys Gln Arg Met Thr Arg Arg Ser Leu Pro Gly Glu Gln Glu Gln
1605 1610 1615

Glu Val Ala Gly Ser Lys Val Phe Leu Glu Ile Asp Asn Arg Gln Cys
1620 1625 1630

Val Gln Asp Ser Asp His Cys Phe Lys Asn Thr Asp Ala Ala Ala Ala
1635 1640 1645

Leu Leu Ala Ser His Ala Ile Gln Gly Thr Leu Ser Tyr Pro Leu Val
1650 1655 1660

Ser Val Val Ser Glu Ser Leu Thr Pro Glu Arg Thr Gln Leu Leu Tyr
1665 1670 1675 1680

Leu Leu Ala Val Ala Val Val Ile Ile Leu Phe Ile Ile Leu Leu Gly
1685 1690 1695

Val Ile Met Ala Lys Arg Lys Arg Lys His Gly Ser Leu Trp Leu Pro
1700 1705 1710

Glu Gly Phe Thr Leu Arg Arg Asp Ala Ser Asn His Lys Arg Arg Glu
1715 1720 1725

Pro Val Gly Gln Asp Ala ja^Gly Leu Lys Asn tauter Val Gln Val
1730 1735 1740

Ser Glu Ala Asn Leu Ile Gly Thr Gly Thr Ser Glu His Trp Val Asp
1745 1750 1755 1760



Asp Glu Gly Pro Gln Pro Lys Lys Val Lys Ala Glu Asp Glu Ala Leu
1765 1770 1775

Leu Ser Glu Glu Asp Asp Pro Ile Asp Arg Arg Pro Trp Thr Gln Gln
1780 1785 1790

His Leu Gl^Ala Ala Asp Ile Arg Arg Thr Pro Ser Leu Ala Leu Thr
1795 1800 1805

Pro Pro Gln Ala Glu Gln Glu Val Asp Val Leu Asp Val Asn Val Arg
1810 1815 1820

Gly Pro Asp Gly Cys Thr Pro Leu Met Leu Ala Ser Leu Arg Gly ol^
1825 1830 1835 1840

Ser Ser Asp Leu Ser Asp Glu Asp Glu Asp Ala Glu Asp Ser Ser Ala
1845 1850 1855

Asn Ile Ile Thr Asp Leu Val Tyr Gln Gly Ala Ser Leu Gln Ala Gln
1860 1865 1870

Thr Asp Arg Thr Gly Glu Met Ala Leu His Leu Ala Ala Arg Tyr Ser
1875 1880 1885

Arg Ala Asp Ala Ala Lys Arg Leu Leu Asp Ala Gly Ala Asp Ala Asn 
1890 1895 1900 

Ala Gln Asp Asn Met Gly Arg Cys Pro Leu His Ala Ala Val Ala Ala
1905 1910 1915 1920

Asp Ala Gln Gly Val Phe Gln Ile Leu Ile Arg Asn Arg Val Thr Asp
1925 1930 1935

Leu Asp Ala Arg Met Asn Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala
1940 1945 1950

Arg Leu Ala Val Glu Gly Met Val Ala Glu Leu Ile Asn Cys Gln Ala
1955 1960 1965

Asp Val Asn Ala Val Asp Asp His Gly Lys Ser Ala Leu His Trp Ala
1970 1975 1980

Ala Ala Val Asn Asn Val Glu Ala Thr Leu Leu Leu Leu Lys Asn Gly
1985 1990 1995 2000

Ala Asn Arg Asp Met Gln Asp Asn Lys Glu Glu Thr Pro Leu Phe Leu
2005 2010 2015

Ala Ala Arg Glu Gly Ser Tyr Glu Ala Ala Lys Ile Leu Leu Asp His
2020 2025 2030

Phe Ala Asn Arg Asp Ile Thr Asp His Met Asp Arg Leu Pro Arg Asp
2035 2040 2045

Val Ala Arg Asp Arg Met His His Asp Ile Val Arg Leu Leu Asp Glu
2050 2055 2060

Tyr Asn Val Thr Pro Ser Pro Pro Gly Thr Val Leu Thr Ser Ala Leu
2065 2070 2075 2080

Ser Pro Val Ile Cys Gly Pro Asn Arg Ser Phe Leu Ser Leu Lys His
2085 2090 2095

Thr Pro Met Gly Lys Lys Ser Arg Arg Pro Ser Ala Lys Ser Thr Met
2100 2105 2110

Pro Thr Ser Leu Pro Asn Leu Ala Lys Glu Ala Lys Asp Ala Lys Gly
2115 2120 2125

Ser Arg Arg Lys Lys Ser Leu Ser Glu Lys Val Gln Leu Ser Glu Ser
2130 2135 2140

Ser Val Thr Leu Ser Pro Val Asp Ser Leu Glu Ser Pro His Thr Tyr
2145 2150 2155 2160

Val Ser Asp Thr Thr Ser Ser Pro Met Ile Thr Ser r.-a Gly Ile Leu
2165 2170 2175

Gln Ala Ser Pro Asn Pro Met Leu Ala Thr Ala Ala Pro Pro Ala Pro
2180 2185 2190

Val His Ala Gln His Ala Leu Ser Phe Ser Asn Leu His Glu Met Gln
2195 2200 2205

Pro Leu Ala His Gly Ala Ser Thr Val Leu Pro Ser Val Ser Gln Leu
2210 2215 2220

Leu Ser His His His Ile Val Ser Pro Gly Ser Gly Ser Ala Gly Ser
2225 2230 2235 2240

Leu Ser Arg Leu His Pro Val Pro Val Pro Ala Asp Trp Met Asn Arg 
2245 2250 2255 

Met Glu Val Asn Glu Thr Gln Tyr Asn Glu Met Phe Gly Met Val Leu
2260 2265 2270

Ala Pro Ala Glu Gly Thr His Pro Gly Ile Ala Pro Gln Ser Arg Pro
2275 2280 2285

Pro Glu Gly Lys His Ile Thr Thr Pro Arg Glu Pro Leu Pro Pro Ile
2290 2295 2300

Val Thr Phe Gln Leu Ile Pro Lys Gly Ser Ile Ala Gln Pro Ala Gly
2305 2310 2315 2320

Ala Pro Gln Pro Gln Ser Thr Cys Pro Pro Ala Val Ala Gly Pro Leu
2325 2330 2335

Pro Thr Met Tyr Gln Ile Pro Glu Met Ala Arg Leu Pro Ser Val Ala
2340 2345 2350

Phe Pro Thr Ala Met Met Pro Gln Gln Asp Gly Gln Val Ala Gln Thr
2355 2360 2365

Ile Leu Pro Ala Tyr His Pro Phe Pro Ala Ser Val Gly Lys Tyr Pro
2370 2375 2380

Thr Pro Pro Ser Gln His Ser Tyr Ala Ser Ser Asn Ala Ala Glu Arg
2385 2390 2395 2400

Thr Pro Ser His Ser Gly His Leu Gln Gly Glu His Pro Tyr Leu Thr
2405 2410 2415

Pro Ser Pro Glu Ser Pro Asp Gln Trp Ser Ser Ser Ser Pro His Ser
2420 2425 2430

Ala Ser Asp Trp Ser Asp Val Thr Thr Ser Pro Thr Pro Gly Gly Ala
2435 2440 2445




Gly Gly Gly Gln Arg Gly Pro Gly Thr His Met Ser Glu Pro Pro His
2450 2455 2460

Asn Asn Met Gln Val Tyr Ala
2465 2470 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2556 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Pro Pro Leu Leu Ala Pro Leu Leu Cys Leu Ala Leu Leu Pro Ala 
1 5 10 15

Leu Ala Ala Arg Gly Pro Arg Cys Ser Gln Pro Gly Glu Thr Cys Leu
20 25 30

Asn Gly Gly Lys Cys Glu Ala Ala Asn Gly Thr Glu Ala Cys Val Cys
35 40 45

Gly Gly Ala Phe Val Gly Pro Arg Cys Gln Asp Pro Asn Pro Cys Leu
50 55 60

Ser Thr Pro Cys Lys Asn Ala Gly Thr Cys His Val Val Asp Arg Arg
65 70 75 80

Gly Val Ala Asp Tyr Ala Cys Ser Cys Ala Leu Gly Phe Ser Gly Pro
85 90 95

Leu Cys Leu Thr Pro Leu Asp Asn Ala Cys Leu Thr Asn Pro Cys Arg
100 105 110

Asn Gly Gly Thr Cys Asp Leu Leu Thr Leu Thr Glu Tyr Lys Cys Arg
115 120 125

Cys Pro Pro Gly Trp Ser Gly Lys Ser Cys Gln Gln Ala Asp Pro Cys
130 135 140

Ala Ser Asn Pro Cys Ala Asn Gly Gly Gln Cys Leu Pro Phe Glu Ala
145 150 155 160

Ser Tyr Ile Cys His Cys Pro Pro Ser Phe His Gly Pro Thr Cys Arg
165 170 175

Gln Asp Val Asn Glu Cys Gly Gln Lys Pro Arg Leu Cys Arg His Gly
180 185 190

Gly Thr Cys His Asn Glu Val Gly Ser Tyr Arg Cys Val Cys Arg Ala
195 200 205

Thr His Thr Gly Pro Asn Cys Glu Arg Pro Tyr Val Pro Cys Ser Pro
210 215 220

Ser Pro Cys Gln Asn Gly Gly Thr Cys Arg Pro Thr Gly Asp Val Thr
225 230 235 240

His Glu Cys Ala Cys Leu Pro Gly Phe Thr Gly Gln Asn Cys Glu Glu
245 250 255

Asn Ile Asp Asp Cys Pro Gly Asn Asn Cys Lys Asn Gly Gly Ala Cys
260 265 270 

Val Asp Gly Val Asn Thr Tyr Asn Cys Pro Cys Pro Pro Glu Trp Thr 
275 280 285 

Gly Gln Tyr Cys Thr Glu Asp Val Asp Glu Cys Gln Leu Met Pro Asn
290 295 300

Ala Cys Gln Asn Gly Gly Thr Cys His Asn Thr His Gly Gly Tyr Asn
305 310 315 320

Cys Val Cys Val Asn Gly Trp Thr Gly Glu Asp Cys Ser Glu Asn Ile
325 330 335

Asp Asp Cys Ala Ser Ala Ala Cys Phe His Gly Ala Thr Cys His Asp
340 345 350

Arg Val Ala Ser Phe Tyr Cys Glu Cys Pro His Gly Arg Thr Gly Leu
355 360 365

Leu Cys His Leu Asn Asp Ala Cys Ile Ser Asn Pro Cys Asn Glu Gly
370 375 380

Ser Asn Cys Asp Thr Asn Pro Val Asn Gly Lys Ala Ile Cys Thr Cys
385 390 395 400

Pro Ser Gly Tyr Thr Gly Pro Ala Cys Ser Gln Asp Val Asp Glu Cys
405 410 415

Ser Leu Gly Ala Asn Pro Cys Glu His Ala Gly Lys Cys Ile Asn Thr
420 425 430

Leu Gly Ser Phe Glu Cys Gln Cys Leu Gln Gly Tyr Thr Gly Pro Arg
435 440 445

Cys Glu Ile Asp Val Asn Glu Cys Val Ser Asn Pro Cys Gln Asn Asp
450 455 460

Ala Thr Cys Leu Asp Gln Ile Gly Glu Phe Gln Cys Met Cys Met Pro
465 470 475 480

Gly Tyr Glu Gly Val His Cys Glu Val Asn Thr Asp Glu Cys Ala Ser
485 490 495

Ser Pro Cys Leu His Asn Gly Arg Cys Leu Asp Lys Ile Asn Glu Phe
500 505 510

Gln Cys Glu Cys Pro Thr Gly Phe Thr Gly His Leu Cys Gln Tyr Asp
515 520 525

Val Asp Glu Cys Ala Ser Thr Pro Cys Lys Asn Gly Ala Lys Cys Leu
530 535 540

Asp Gly Pro Asn Thr Tyr Thr Cys Val Cys Thr Glu Gly Tyr Thr Gly
545 550 555 560

Thr His Cys Glu Val Asp Ile Asp Glu Cys Asp Pro Asp Pro Cys His
565 570 575

Tyr Gly Ser Cys Lys Asp Gly Val Ala Thr Phe Thr Cys Leu Cys Arg
580 585 590

Pro Gly Tyr Thr Gly His His Cys Glu Thr Asn Ile Asn Glu Cys Ser
595 600 605

Ser Gln Pro Cys Arg Leu Arg Gly Thr Cys Gln Asp Pro Asp Asn Ala
610 615 620



Tyr Leu Cys Phe Cys Leu Lys Gly Thr Thr Gly Pro Asn Cys Glu Ile
625 630 635 640

Asn Leu Asp Asp Cys Ala Ser Ser Pro Cys Asp Ser Gly Thr Cys Leu
645 650 655

Asp Lys Ile Asp Gly Tyr Glu Cys Ala Cys Glu Pro Gly Tyr Thr Gly
660 665 670

Ser Met Cys Asn Ser Asn Ile Asp Glu Cys Ala Gly Asn Pro Cys His
675 680 685

Asn Gly Gly Thr Cys Glu Asp Gly Ile Asn Gly Phe Thr Cys Arg Cys
690 695 700

Pro Glu Gly Tyr His Asp Pro Thr Cys Leu Ser Glu Val Asn Glu Cys
705 710 715 720

Asn Ser Asn Pro Cys Val His Gly Ala Cys Arg Asp Ser Leu Asn Gly
725 730 735

Tyr Lys Cys Asp Cys Asp Pro Gly Trp Ser Gly Thr Asn Cys Asp Ile
740 745 750

Asn Asn Asn Glu Cys Glu Ser Asn Pro Cys Val Asn Gly Gly Thr Cys
755 760 765

Lys Asp Met Thr Ser Gly Ile Val Cys Thr Cys Arg Glu Gly Phe Ser
770 775 780

Gly Pro Asn Cys Gln Thr Asn Ile Asn Glu Cys Ala Ser Asn Pro Cys
785 790 795 800

Leu Asn Lys Gly Thr Cys Ile Asp Asp Val Ala Gly Tyr Lys Cys Asn
805 810 815

Cys Leu Leu Pro Tyr Thr Gly Ala Thr Cys Glu Val Val Leu Ala Pro
820 825 830

Cys Ala Pro Ser Pro Cys Arg Asn Gly Gly Glu Cys Arg Gln Ser Glu
835 840 845

Asp Tyr Glu Ser Phe Ser Cys Val Cys Pro Thr Ala Gly Ala Lys Gly
850 855 860

Gln Thr Cys Glu Val Asp Ile Asn Glu Cys Val Leu Ser Pro Cys Arg
865 870 875 880

His Gly Ala Ser Cys Gln Asn Thr His Gly Gly Tyr Arg Cys His Cys
885 890 895

Gln Ala Gly Tyr Ser Gly Arg Asn Cys Glu Thr Asp Ile Asp Asp Cys
900 905 910

Arg Pro Asn Pro Cys His Asn Gly Gly Ser Cys Thr Asp Gly Ile Asn
915 920 925

Thr Ala Phe Cys Asp Cys Leu Pro Gly Phe Arg Gly Thr Phe Cys Glu
930 935 940

Glu Asp Ile Asn Glu Cys Ala Ser Asp Pro Cys Arg Asn Gly Ala Asn
945 950 955 960

Cys Thr Asp Cys Val Asp Ser Tyr Thr Cys Thr Cys Pro Ala Gly Phe
965 970 975

Ser Gly Ile His Cys Glu Asn Asn Thr Pro Asp Cys Thr Glu Ser Ser
980 985 990 

Cys Phe Asn Gly Gly Thr Cys Val Asp Gly Ile Asn Ser Phe Thr Cys
995 1000 1005

Leu Cys Pro Pro Gly Phe Thr Gly Ser Tyr Cys Gln His Val Val Asn
1010 1015 1020

Glu Cys Asp Ser Arg Pro Cys Leu Leu Gly Gly Thr Cys Gln Asp Gly
1025 1030 1035 1040

Arg Gly Leu His Arg Cys Thr Cys Pro Gln Gly Tyr Thr Gly Pro Asn
1045 1050 1055

Cys Gln Asn Leu Val His Trp Cys Asp Ser Ser Pro Cys Lys Asn Gly
1060 1065 1070

Gly Lys Cys Trp Gln Thr His Thr Gln Tyr Arg Cys Glu Cys Pro Ser
1075 1080 1085

Gly Trp Thr Gly Leu Tyr Cys Asp Val Pro Ser Val Ser Cys Glu Val
1090 1095 1100

Ala Ala Gln Arg Gln Gly Val Asp Val Ala Arg Leu Cys Gln His Gly
1105 1110 1115 1120

Gly Leu Cys Val Asp Ala Gly Asn Thr His His Cys Arg Cys Gln Ala
1125 1130 1135

Gly Tyr Thr Gly Ser Tyr Cys Glu Asp Leu Val Asp Glu Cys Ser Pro 
1140 1145 1150 

Ser Pro Cys Gln Asn Gly Ala Thr Cys Thr Asp Tyr Leu Gly Gly Tyr
1155 1160 1165

Ser Cys Lys Cys Val Ala Gly Tyr His Gly Val Asn Cys Ser Glu Glu
1170 1175 1180

Ile Asp Glu Cys Leu Ser His Pro Cys Gln Asn Gly Gly Thr Cys Leu
1185 1190 1195 1200

Asp Leu Pro Asn Thr Tyr Lys Cys Ser Cys Pro Arg Gly Thr Gln Gly
1205 1210 1215

Val His Cys Glu Ile Asn Val Asp Asp Cys Asn Pro Pro Val Asp Pro
1220 1225 1230

Val Ser Arg Ser Pro Lys Cys Phe Asn Asn Gly Thr Cys Val Asp Gln
1235 1240 1245

Val Gly Gly Tyr Ser Cys Thr Cys Pro Pro Gly Phe Val Gly Glu Arg
1250 1255 1260

Cys Glu Gly Asp Val Asn Glu Cys Leu Ser Asn Pro Cys Asp Ala Arg
1265 1270 1275 1280

Gly Thr Gln Asn Cys Val Gln Arg Val Asn Asp Phe His Cys Glu Cys
1285 1290 1295

Arg Ala Gly His Thr Gly Arg Arg Cys Glu Ser Val Ile Asn Gly Cys
1300 1305 1310

Lys Gly Lys Pro Cys Lys Asn Gly Gly Thr Cys Ala Val Ala Ser Asn
1315 1320 1325

Thr Ala Arg Gly Phe Ile Cys Lys Cys Pro Ala Gly Phe Glu Gly Ala
1330 1335 1340



Thr Cys Glu Asn Asp Ala Arg Thr Cys Gly Ser Leu Arg Cys Leu Asn
1345 1350 1355 1360

Gly Gly Thr Cys Ile Ser Gly Pro Arg Ser Pro Thr Cys Leu Cys Leu
1365 1370 1375

Gly Pro Phe Thr Gly Pro Glu Cys Gln Phe Pro Ala Ser Ser Pro Cys
1380 1385 1390

Leu Gly Gly Asn Pro Cys Tyr Asn Gln Gly Thr Cys Glu Pro Thr Ser
1395 1400 1405

Glu Ser Pro Phe Tyr Arg Cys Leu Cys Pro Ala Lys Phe Asn Gly Leu
1410 1415 1420

Leu Cys His Ile Leu Asp Tyr Ser Phe Gly Gly Gly Ala Gly Arg Asp
1425 1430 1435 1440

Ile Pro Pro Pro Leu Ile Glu Glu Ala Cys Glu Leu Pro Glu Cys Gln
1445 1450 1455

Glu Asp Ala Gly Asn Lys Val Cys Ser Leu Gln Cys Asn Asn His Ala
1460 1465 1470

Cys Gly Trp Asp Gly Gly Asp Cys Ser Leu Asn Phe Asn Asp Pro Trp
1475 1480 1485

Lys Asn Cys Thr Gin Ser Jjj 5 «.» <*■ Tro L * - goo**" S " *** ° lY 



1490 



His Cys Asp Ser Gln Cys Asn Ser Ala Gly Cys Leu Phe Asp Gly Phe
1505 1510 1515 1520

Asp Cys Gin Arg Al^Glu Gly Gin Cys Leu T * r AB P 5g s Tyr 

Cys Lys Asp His Phe Ser Asp Gly His Cys Asp Gln Gly Cys Asn Ser
1540 1545 1550

Ala Glu Cys Glu Trp Asp Gly Leu Asp Cys Ala Glu His Val Pro Glu
1555 1560 1565

Arg Leu Ala Ala Gly Thr Leu Val Val Val Val Leu Met Pro Pro Glu
1570 1575 1580

Gln Leu Arg Asn Ser Ser Phe His Phe Leu Arg Glu Leu Ser Arg Val
1585 1590 1595 1600

Leu His Thr Asn Val Val Phe Lys Arg Asp Ala His Gly Gln Gln Met
1605 1610 1615

Ile Phe Pro Tyr Tyr Gly Arg Glu Glu Glu Leu Arg Lys His Pro Ile
1620 1625 1630

Lys Arg Ala Ala Glu Gly Trp Ala Ala Pro Asp Ala Leu Leu Gly Gln
1635 1640 1645

Val Lys Ala Ser Leu Leu Pro Gly Gly Ser Glu Gly Gly Arg Arg Arg
1650 1655 1660

Arg Glu Leu Asp Pro Met Asp Val Arg Gly Ser Ile Val Tyr Leu Glu
1665 1670 1675 1680

Ile Asp Asn Arg Gln Cys Val Gln Ala Server Gln Cys Phe Gl^Ser
1685 1690 1695

Ala Thr Asp Val Ala Ala Phe Leu Gly Ala Leu Ala Ser Leu Gly Ser
1700 1705 1710

Leu Asn Ile Pro Tyr Lys Ile Glu Ala Val Gln Ser Glu Thr Val Glu
1715 1720 1725

Pro Pro Pro Pro Ala Gln Leu His Phe Met Tyr Val Ala Ala Ala Ala
1730 1735 1740 

Phe Val Leu Leu Phe ^Val Gly Cys Gly Val g Leu Leu Ser Arg Lys^ 

Arg Arg Arg Gln His Gly Gln Leu Trp Phe Pro Glu Gly Phe Lys Val
1765 1770 1775

Ser Glu Ala Ser Lys Lys Lys Arg Arg Glu Glu Leu Gly Glu Asp Ser
1780 1785 1790

Val Gly Leu Lys Pro Leu Lys Asn Ala Ser Asp Gly Ala Leu Met Asp
1795 1800 1805

Asp Asn Gln Asn Glu Trp Gly Asp Glu Asp Leu Glu Thr Lys Lys Phe
1810 1815 1820

Arg Phe Glu Glu Pro Val Val Leu Pro Asp Leu Asp Asp Gln Thr Asp
1825 1830 1835 1840

His Arg Gln Trp Thr Gln Gln His Leu Asp Ala Ala Asp Leu Jjg s ,l-fc
1845 1850 1855

Ser Ala Met Ala Pro Thr Pro Pro Gln Gly Glu Val Asp Ala Asp Cys
1860 1865 1870

Met Asp Val Asn Val Arg Gly Pro Asp Gly Phe Thr Pro Leu Met Ile
1875 1880 1885

Ala Ser Cys Ser Gly Gly Gly Leu Glu Thr Gly Asn Ser Glu Glu Glu
1890 1895 1900

Glu Asp Ala Pro Ala Val Ile Ser Asp Phe Ile Tyr Gln Gly Ala Ser
1905 1910 1915 1920

Leu His Asn Gln Thr Asp Arg Thr Gly Glu Thr Ala Leu His Leu Ala
1925 1930 1935

Ala Arg Tyr Ser Arg Ser Asp Ala Ala Lys Arg Leu Leu Glu Ala Ser
1940 1945 1950

Ala Asp Ala Asn Ile Gln Asp Asn Met Gly Arg Thr Pro Leu His Ala
1955 1960 1965

Ala Val Ser Ala Asp Ala Gln Gly Val Phe Gln Ile Leu Ile Arg Asn
1970 1975 1980

Arg Ala Thr Asp Leu Asp Ala Arg Met His As^Gly Thr Thr Pro Leu
1985 1990 1995 2000

Ile Leu Ala Ala Arg Leu Ala Val Glu Gly Met Leu Glu Asp Leu Ile
2005 2010 2015

Asn Ser His Ala Asp Val Asn Ala Val Asp Asp Leu Gly Lys Ser Ala
2020 2025 2030

Leu His Trp Ala Ala Ala Val Asn Asn Val Asp Ala Ala Val Val Leu
2035 2040 2045

Leu Lys Asn Gly Ala Asn Lys Asp Met Gln Asn Asn Arg Glu Glu Thr
2050 2055 2060

... Thr Ala Lys Val
2065 2070 2075 2080

Leu Leu Asp His Phe Ala Asn Arg Asp Ile Thr fl His Met Asp Arg
2085 2090 2095

Leu Pro Arg Asp Ile Ala Gln Glu Arg Met His H Asp Ile Val Arg
2100 2105 2110

Leu Leu Asp Glu Tyr Asn Leu Val Arg Ser Pro G Leu His Gly Ala
2115 2120 2125

Pro Leu Gly Gly Thr Pro Thr Leu Ser Pro Pro I Cys Ser Pro Asn
2130 2135 2140

Gly Tyr Leu Gly Ser Leu Lys Pro Gly Val Gln C Lys Lys Val Arg
2145 2150 2155 2160

Lys Pro Ser Ser Lys Gly Leu Ala Cys Gly Ser I Glu Ala Lys Asp
2165 2170 2175

Leu Lys Ala Arg Arg Lys Lys Ser Gln Asp Gly I Gly Cys Leu Leu
2180 2185 2190

Asp Ser Ser Gly Met Leu Ser Pro Val Asp Ser 1 Glu Ser Pro His
2195 2200 2205

... Pro Ser Pro Phe
2210 2215 2220



Gln Gln Ser Pro Ser Val Pro Leu Asn His Leu Pro Gly Met Pro Asp
2225 2230 2235 2240

Thr His Leu Gly Ile Gly His Leu Asn Val Ala Ala Lys Pro Glu Met
2245 2250 2255

Ala Ala Leu Gly Gly Gly Gly Arg Leu Ala Phe Glu Thr Gly Pro Pro
2260 2265 2270

Arg Leu Ser His Leu Pro Val Ala Ser Gly Thr Ser Thr Val Leu Gly
2275 2280 2285

Ser Ser Ser Gly Gly Ala Leu Asn Phe Thr Val Gly Gly Ser Thr Ser
2290 2295 2300

Leu Asn Gly Gln Cys Glu Trp Leu Ser Arg Leu Gln Ser Gly Met Val
2305 2310 2315 2320

Pro Asn Gln Tyr As^Pro Leu Arg Gly Serial Ala Pro Gly Pro Leu
2325 2330 2335

Ser Thr Gln Ala Pro Ser Leu Gln His Gly Met Val Gly Pro Leu His
2340 2345 2350

Ser Ser Leu Ala Ala Ser Ala Leu Ser Gln Met Met Ser Tyr Gln Gly
2355 2360 2365

Leu Pro Ser Thr Arg Leu Ala Thr Gln Pro His Leu Val Gln Thr Gln
2370 2375 2380

Gln Val Gln Pro Gln Asn Leu Gln Met Gln Gl^Gln Asn Leu Gln Pro
2385 2390 2395 2400

Ala Asn Ile Gln Gln Gln Gln Ser Leu Gl^Pro Pro Pro Pro Pro Pro
2405 2410 2415

Gln Pro His Leu Gly Val Ser Ser Ala Ala Ser Gly His Leu Gly Arg
2420 2425 2430 

Ser Phe Leu Ser Gly Glu Pro Ser Gln Ala Asp Val Gln Pro Leu Gly
2435 2440 2445

Pro Ser Ser Leu Ala Val His Thr Ile Leu Pro Gln Glu Ser Pro Ala
2450 2455 2460

Leu Pro Thr Ser Leu Pro Ser Ser Leu Val Pro Pro Val Thr Ala Ala
2465 2470 2475 2480

Gln Phe Leu Thr Pro Pro Ser Gln His Ser Tyr Ser Ser Pro Val Glu
2485 2490 2495

Asn Thr Pro Ser His Gln Leu Gln Val Pro Glu His Pro Phe Leu Thr
2500 2505 2510

Pro Ser Pro Glu Ser Pro Asp Gln Trp Ser Ser Ser Ser Pro His Ser
2515 2520 2525

Asn Val Ser Asp Trp Ser Glu Gly Val Ser Ser Pro Pro Thr Ser Met
2530 2535 2540

Gln Ser Gln Ile Ala Arg Ile Pro Glu Ala Phe Lys
2545 2550 2555 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9723 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..7419



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 




48 



96 



144 



192 



240 



288
80 85 90

GGG TTT ACA CCA GAG GAC TGC CAG TAC TCA ACA TCT CAT CCA TGC TTT 336 

Gly Phe Thr Gly Glu Asp Cys Gln Tyr Ser Thr Ser His Pro Cys Phe
95 100 105

GTG TCT CGA CCC TGC CTG AAT GGC GGC ACA TGC CAT ATG CTC AGC CGG 384 

Val Ser Arg Pro Cys Leu Asn Gly Gly Thr Cys His Met Leu Ser Arg 

110 115 120 125 

GAT ACC TAT GAG TGC ACC TGT CAA GTC GGG TTT ACA GGT AAG GAG TGC 432 

Asp Thr Tyr Glu Cys Thr Cys Gln Val Gly Phe Thr Gly Lys Glu Cys
130 135 140

CAA TGG ACG GAT GCC TGC CTG TCT CAT CCC TGT GCA AAT GGA AGT ACC 480
Gln Trp Thr Asp Ala Cys Leu Ser His Pro Cys Ala Asn Gly Ser Thr
145 150 155

TGT ACC ACT GTG GCC AAC CAG TTC TCC TGC AAA TGC CTC ACA GGC TTC 528
Cys Thr Thr Val Ala Asn Gln Phe Ser Cys Lys Cys Leu Thr Gly Phe
160 165 170

ACA CGG CAG AAA TGT GAG ACT GAT GTC AAT GAG TGT GAC ATT CCA GGA 576
Thr Gly Gln Lys Cys Glu Thr Asp Val Asn Glu Cys Asp Ile Pro Gly
175 180 185

CAC TGC CAG CAT GGT GGC ACC TGC CTC AAC CTG CCT GGT TCC TAC CAG 624
His Cys Gln His Gly Gly Thr Cys Leu Asn Leu Pro Gly Ser Tyr Gln
190 195 200 205



TGC CAG TGC CCT CAG GGC TTC ACA GGC CAG TAC TGT GAC AGC CTG TAT 672 
Cys Gln Cys Pro Gln Gly Phe Thr Gly Gln Tyr Cys Asp Ser Leu Tyr
210 215 220

GTG CCC TGT GCA CCC TCA CCT TGT GTC AAT GGA GGC ACC TGT CCC CAG 720
Val Pro Cys Ala Pro Ser Pro Cys Val Asn Gly Gly Thr Cys Arg Gln
225 230 235

ACT GGT GAC TTC ACT TTT GAG TCC AAC TGC CTT CCA GGT TTT GAA GGG 768
Thr Gly Asp Phe Thr Phe Glu Cys Asn Cys Leu Pro Gly Phe Glu Gly
240 245 250

AGC ACC TGT CAG AGO AAT ATT GAT GAC TGC CCT AAC CAC AGG TGT CAG 816
Ser Thr Cys Glu Arg Asn Ile Asp Asp Cys Pro Asn His Arg Cys Gln
255 260 265

AAT GGA GGG GTT TGT GTG GAT GGG GTC AAC ACT TAC AAC TCC CGC TGT 864
Asn Gly Gly Val Cys Val Asp Gly Val Asn Thr Tyr Asn Cys Arg Cys
270 275 280 285

CCC CCA CAA TGG ACA GGA CAG TTC TGC ACA GAG GAT GTG GAT GAA TGC 912
Pro Pro Gln Trp Thr Gly Gln Phe Cys Thr Glu Asp Val Asp Glu Cys
290 295 300

CTG CTG CAG CCC AAT GCC TGT CAA AAT GGG GGC ACC TGT GCC AAC CGC 960
Leu Leu Gln Pro Asn Ala Cys Gln Asn Gly Gly Thr Cys Ala Asn Arg
305 310 315

AAT GGA GGC TAT GGC TGT GTA TGT GTC AAC GGC TGG AGT GGA GAT GAC 1008
Asn Gly Gly Tyr Gly Cys Val Cys Val Asn Gly Trp Ser Gly Asp Asp
320 325 330

TGC AGT GAG AAC ATT GAT GAT TGT GCC TTC GCC TCC TGT ACT CCA GGC 1056
Cys Ser Glu Asn Ile Asp Asp Cys Ala Phe Ala Ser Cys Thr Pro Gly
335 340 345

TCC ACC TGC ATC GAC CGT GTG GCC TCC TTC TCT TGC ATG TGC CCA GAG 1104
Ser Thr Cys Ile Asp Arg Val Ala Ser Phe Ser Cys Met Cys Pro Glu
350 355 360 365

GGG AAG GCA GGT CTC CTG TGT CAT CTG GAT GAT GCA TGC ATC AGC AAT 1152 
Gly Lys Ala Gly Leu Leu Cys His Leu Asp Asp Ala Cys Ile Ser Asn
370 375 380

CCT TGC CAC AAG GGG GCA CTG TGT GAC ACC AAC CCC CTA AAT GGG CAA 1200
Pro Cys His Lys Gly Ala Leu Cys Asp Thr Asn Pro Leu Asn Gly Gln
385 390 395

TAT ATT TGC ACC TGC CCA CAA GGC TAC AAA GGG GCT GAC TGC ACA CAA 1248
Tyr Ile Cys Thr Cys Pro Gln Gly Tyr Lys Gly Ala Asp Cys Thr Glu
400 405 410

GAT CTG GAT GAA TGT GCC ATG GCC AAT AGC AAT CCT TGT GAG CAT GCA 1296
Asp Val Asp Glu Cys Ala Met Ala Asn Ser Asn Pro Cys Glu His Ala
415 420 425

GCA AAA TGT GTG AAC ACG GAT GGC GCC TTC CAC TGT GAG TGT CTG AAG 1344
Gly Lys Cys Val Asn Thr Asp Gly Ala Phe His Cys Glu Cys Leu Lys
430 435 440 445

GGT TAT GCA GGA CCT CGT TGT GAG ATG GAC ATC AAT GAG TGC CAT TCA 1392
Gly Tyr Ala Gly Pro Arg Cys Glu Met Asp Ile Asn Glu Cys His Ser
450 455 460

GAC CCC TGC CAG AAT GAT GCT ACC TGT CTG GAT AAG ATT GGA GGC TTC 1440
Asp Pro Cys Gln Asn Asp Ala Thr Cys Leu Asp Lys Ile Gly Gly Phe
465 470 475

ACA TGT CTG TGC ATG CCA GCT TTC AAA GGT GTG CAT TGT GAA TTA GAA 1488
Thr Cys Leu Cys Met Pro Gly Phe Lys Gly Val His Cys Glu Leu Glu
480 485 490

ATA AAT GAA TGT CAG AGC AAC CCT TGT GTG AAC AAT GGG CAG TGT GTG 1536 
Ile Asn Glu Cys Gln Ser Asn Pro Cys Val Asn Asn Gly Gln Cys Val
495 500 505

GAT AAA GTC AAT CGT TTC CAG TGC CTG TGT CCT CCT GGT TTC ACT GGG 1584
Asp Lys Val Asn Arg Phe Gln Cys Leu Cys Pro Pro Gly Phe Thr Gly
510 515 520 525

CCA GTT TGC CAG ATT GAT ATT GAT GAC TGT TCC ACT ACT CCG TGT CTG 1632
Pro Val Cys Gln Ile Asp Ile Asp Asp Cys Ser Ser Thr Pro Cys Leu
530 535 540

AAT GGG GCA AAG TGT ATC GAT CAC CCG AAT GGC TAT GAA TGC CAG TGT 1680
Asn Gly Ala Lys Cys Ile Asp His Pro Asn Gly Tyr Glu Cys Gln Cys
545 550 555

GCC ACA GGT TTC ACT GGT GTG TTG TGT GAG GAG AAC ATT GAC AAC TGT 1728
Ala Thr Gly Phe Thr Gly Val Leu Cys Glu Glu Asn Ile Asp Asn Cys
560 565 570

GAC CCC GAT CCT TGC CAC CAT GGT CAG TGT CAG GAT GGT ATT GAT TCC 1776
Asp Pro Asp Pro Cys His His Gly Gln Cys Gln Asp Gly Ile Asp Ser
575 580 585

TAC ACC TGC ATC TGC AAT CCC GGG TAC ATG GGC GCC ATC TGC AGT GAC 1824
Tyr Thr Cys Ile Cys Asn Pro Gly Tyr Met Gly Ala Ile Cys Ser Asp
590 595 600 605

CAG ATT GAT GAA TGT TAC AGC AGC CCT TGC CTG AAC GAT GGT CGC TGC 1872
Gln Ile Asp Glu Cys Tyr Ser Ser Pro Cys Leu Asn Asp Gly Arg Cys
610 615 620

ATT GAC CTG GTC AAT GGC TAC CAG TGC AAC TGC CAG CCA GGC ACG TCA 1920
Ile Asp Leu Val Asn Gly Tyr Gln Cys Asn Cys Gln Pro Gly Thr Ser
625 630 635

GGG GTT AAT TGT GAA ATT AAT TTT GAT GAC TGT GCA AGT AAC CCT TGT 1968
Gly Val Asn Cys Glu Ile Asn Phe Asp Asp Cys Ala Ser Asn Pro Cys
640 645 650

ATC CAT GGA ATC TGT ATG GAT GGC ATT AAT CGC TAC AGT TGT GTC TGC 2016
Ile His Gly Ile Cys Met Asp Gly Ile Asn Arg Tyr Ser Cys Val Cys
655 660 665

TCA CCA GGA TTC ACA GGG CAG AGA TGT AAC ATT GAC ATT GAT GAG TGT 2064
Ser Pro Gly Phe Thr Gly Gln Arg Cys Asn Ile Asp Ile Asp Glu Cys
670 675 680 685

GCC TCC AAT CCC TGT CGC AAG GGT GCA ACA TGT ATC AAC GGT GTG AAT 2112
Ala Ser Asn Pro Cys Arg Lys Gly Ala Thr Cys Ile Asn Gly Val Asn
690 695 700

GGT TTC CGC TGT ATA TGC CCC GAG GGA CCC CAT CAC CCC AGC TGC TAC 2160
Gly Phe Arg Cys Ile Cys Pro Glu Gly Pro His His Pro Ser Cys Tyr
705 710 715

TCA CAG GTG AAC GAA TGC CTG AGC AAT CCC TGC ATC CAT GGA AAC TGT 2208
Ser Gln Val Asn Glu Cys Leu Ser Asn Pro Cys Ile His Gly Asn Cys
720 725 730

ACT GGA GGT CTC AGT GGA TAT AAG TGT CTC TGT GAT GCA GGC TGG GTT 2256
Thr Gly Gly Leu Ser Gly Tyr Lys Cys Leu Cys Asp Ala Gly Trp Val
735 740 745

GGC ATC AAC TGT GAA GTG GAC AAA AAT GAA TGC CTT TCG AAT CCA TGC 2304
Gly Ile Asn Cys Glu Val Asp Lys Asn Glu Cys Leu Ser Asn Pro Cys
750 755 760 765

CAG AAT GGA GGA ACT TGT GAC AAT CTG GTG AAT GGA TAC AGG TGT ACT 2352 
Gln Asn Gly Gly Thr Cys Asp Asn Leu Val Asn Gly Tyr Arg Cys Thr
770 775 780

TGC AAG AAG GGC TTT AAA GGC TAT AAC TGC CAG GTG AAT ATT GAT GAA 2400
Cys Lys Lys Gly Phe Lys Gly Tyr Asn Cys Gln Val Asn Ile Asp Glu
785 790 795

TGT GCC TCA AAT CCA TGC CTG AAC CAA GGA ACC TGC TTT GAT GAC ATA 2448
Cys Ala Ser Asn Pro Cys Leu Asn Gln Gly Thr Cys Phe Asp Asp Ile
800 805 810

AGT GGC TAC ACT TGC CAC TGT GTG CTG CCA TAC ACA GGC AAG AAT TGT 2496
Ser Gly Tyr Thr Cys His Cys Val Leu Pro Tyr Thr Gly Lys Asn Cys
815 820 825

CAG ACA GTA TTG GCT CCC TGT TCC CCA AAC CCT TGT GAG AAT GCT GCT 2544
Gln Thr Val Leu Ala Pro Cys Ser Pro Asn Pro Cys Glu Asn Ala Ala
830 835 840 845

GTT TGC AAA GAG TCA CCA AAT TTT GAG AGT TAT ACT TGC TTG TGT GCT 2592
Val Cys Lys Glu Ser Pro Asn Phe Glu Ser Tyr Thr Cys Leu Cys Ala
850 855 860

CCT GGC TGG CAA GGT CAG CGG TGT ACC ATT GAC ATT GAC GAG TGT ATC 2640
Pro Gly Trp Gln Gly Gln Arg Cys Thr Ile Asp Ile Asp Glu Cys Ile
865 870 875

TCC AAG CCC TGC ATG AAC CAT GGT CTC TGC CAT AAC ACC CAG GGC AGC 2688
Ser Lys Pro Cys Met Asn His Gly Leu Cys His Asn Thr Gln Gly Ser
880 885 890

TAC ATG TGT GAA TGT CCA CCA GGC TTC AGT GGT ATG GAC TGT GAG GAG 2736
Tyr Met Cys Glu Cys Pro Pro Gly Phe Ser Gly Met Asp Cys Glu Glu
895 900 905

G?0 ATT GAT GAC TGC CTT GCC AAT CCT TGC CAG AAT GGA GGT TCC TGT 2784
Asp Ile Asp Asp Cys Leu Ala Asn Pro Cys Gln Asn Gly Gly Ser Cys
910 915 920 925

ATG GAT GGA GTG AAT ACT TTC TCC TGC CTC TGC CTT CCG GGT TTC ACT 2832
Met Asp Gly Val Asn Thr Phe Ser Cys Leu Cys Leu Pro Gly Phe Thr
930 935 940

GGG GAT AAG TGC CAG ACA GAC ATG AAT GAG TGT CTG AGT GAA CCC TGT 2880
Gly Asp Lys Cys Gln Thr Asp Met Asn Glu Cys Leu Ser Glu Pro Cys
945 950 955

AAG AAT GGA GGG ACC TGC TCT GAC TAC GTC AAC AGT TAC ACT TGC AAG 2928
Lys Asn Gly Gly Thr Cys Ser Asp Tyr Val Asn Ser Tyr Thr Cys Lys
960 965 970

TGC CAG GCA GGA TTT GAT GGA GTC CAT TGT GAG AAC AAC ATC AAT GAG 2976
Cys Gln Ala Gly Phe Asp Gly Val His Cys Glu Asn Asn Ile Asn Glu
975 980 985

TGC ACT GAG AGC TCC TGT TTC AAT GGT GGC ACA TGT GTT GAT GGG ATT 3024
Cys Thr Glu Ser Ser Cys Phe Asn Gly Gly Thr Cys Val Asp Gly Ile
990 995 1000 1005



AAC TCC TTC TCT TGC TTG TGC CCT GTG GGT TTC ACT GGA TCC TTC TGC 3072
Asn Ser Phe Ser Cys Leu Cys Pro Val Gly Phe Thr Gly Ser Phe Cys
1010 1015 1020

CTC CAT GAG ATC AAT GAA TGC AGC TCT CAT CCA TGC CTG AAT GAG GGA 3120
Leu His Glu Ile Asn Glu Cys Ser Ser His Pro Cys Leu Asn Glu Gly
1025 1030 1035

ACG TGT GTT GAT GGC CTG GGT ACC TAC CGC TGC AGC TGC CCC CTG GGC 3168
Thr Cys Val Asp Gly Leu Gly Thr Tyr Arg Cys Ser Cys Pro Leu Gly
1040 1045 1050

TAC ACT GGG AAA AAC TGT CAG ACC CTG GTG AAT CTC TGC AGT CGG TCT 3216
Tyr Thr Gly Lys Asn Cys Gln Thr Leu Val Asn Leu Cys Ser Arg Ser
1055 1060 1065



CCA TGT AAA AAC AAA GGT ACT TGT GTT CAG AAA AAA GCA GAG TCC CAG 3264
Pro Cys Lys Asn Lys Gly Thr Cys Val Gln Lys Lys Ala Glu Ser Gln
1070 1075 1080 1085

TGC CTA TGT CCA TCT GGA TGG GCT GGT GCC TAT TGT GAC GTG CCC AAT 3312
Cys Leu Cys Pro Ser Gly Trp Ala Gly Ala Tyr Cys Asp Val Pro Asn
1090 1095 1100

CTC TCT TGT GAC ATA GCA GCC TCC AGG AGA GGT GTG CTT GTT GAA CAC 3360
Val Ser Cys Asp Ile Ala Ala Ser Arg Arg Gly Val Leu Val Glu His
1105 1110 1115

TTG TGC CAG CAC TCA GGT GTC TGC ATC AAT GCT GGC AAC ACG CAT TAC 3408
Leu Cys Gln His Ser Gly Val Cys Ile Asn Ala Gly Asn Thr His Tyr
1120 1125 1130

TGT CAG TGC CCC CTG GGC TAT ACT CGG AGC TAC TGT GAG GAG CAA CTC 3456
Cys Gln Cys Pro Leu Gly Tyr Thr Gly Ser Tyr Cys Glu Glu Gln Leu
1135 1140 1145

GAT GAG TGT GCG TCC AAC CCC TCC CAG CAC GGG GCA ACA TGC AGT GAC 3504
Asp Glu Cys Ala Ser Asn Pro Cys Gln His Gly Ala Thr Cys Ser Asp
1150 1155 1160 1165

[illegible] ACA TGC GAG TCT GTC CCA GGC TAT CAG GGT GTC 3552
[illegible] Thr Cys Glu Ser Val Pro Gly Tyr Gln Gly Val

?AG TAT GAA GTG GAT GAG TGC CAG AAT CAG CCC TGC CAG AAT 3600
[illegible] Tyr Glu Val Asp Glu Cys Gln Asn Gln Pro Cys Gln Asn
1185 1190

GGA GGC ACC TGT ATT GAC CTT GTG AAC CAT TTC AAG TGC TCT TGC CCA 3648
Gly Gly Thr Cys Ile Asp Leu Val Asn His Phe Lys Cys Ser Cys Pro
1200 1205

GGC ?CT CGC GGC CTA CTC TGT GAA GAG AAC ATT GAT GAC TGT GCC 3696
Gly [illegible] Arg Gly Leu Leu Cys Glu Glu Asn Ile Asp Asp Cys Ala
1215 1220

CCC CGT CCC CAT TGC CTT AAT GGT GGT CAG TGC ATG GAT AGG ATT GGA 3744
Pro Arg Pro His Cys Leu Asn Gly Gly Gln Cys Met Asp Arg Ile Gly
1230 1235 1240

GGC TAC AGT TGT CGC TGC TTG CCT GGC TTT GCT GGG GAG CGT TGT GAG 3792
Gly Tyr Ser Cys Arg Cys Leu Pro Gly Phe Ala Gly Glu Arg Cys Glu

??A GAC ATC AAC GAG TGC CTC TCC AAC CCC TGC AGC TCT GAG GGC AGC 3840
[illegible] Asp Ile Asn Glu Cys Leu Ser Asn Pro Cys Ser Ser Glu Gly Ser

CTC GAC TGT ATA CAG CTC ACC AAT GAC TAC CTG TGT GTT TGC CGT AGT 3888
Leu Asp Cys Ile Gln Leu Thr Asn Asp Tyr Leu Cys Val Cys Arg Ser
1280 1285 1290

GCC TTT ACT GGC CGG CAC TGT GAA ACC TTC GTC GAT GTG TGT CCC CAG 3936
Ala Phe Thr Gly Arg His Cys Glu Thr Phe Val Asp Val Cys Pro Gln
1295 1300 1305

ATG CCC TGC CTG AAT GGA GGG ACT TGT GCT GTG GCC AGT AAC ATG CCT 3984
Met Pro Cys Leu Asn Gly Gly Thr Cys Ala Val Ala Ser Asn Met Pro
1310 1315

?A? GGT TTC ATT TGC CGT TGT CCC CCG GGA TTT TCC GGG GCA AGG TGC 4032
[illegible] Gly Phe Ile Cys Arg Cys Pro Pro Gly Phe Ser Gly Ala Arg Cys

CAG AGC AGC TGT GGA CAA GTG AAA TGT AGG AAG GGG GAG CAG TGT GTG 4080
Gln Ser Ser Cys Gly Gln Val Lys Cys Arg Lys Gly Glu Gln Cys Val
1345 1350

CAC ACC GCC TCT GGA CCC CGC TGC TTC TGC CCC AGT CCC CGG GAC TGC 4128
His Thr Ala Ser Gly Pro Arg Cys Phe Cys Pro Ser Pro Arg Asp Cys
1360 1365

CAC TCA GGC TGT GCC AGT AGC CCC TGC CAG CAC GGG GGC AGC TGC CAC 4176
His Ser Gly Cys Ala Ser Ser Pro Cys Gln His Gly Gly Ser Cys His
1375 1380 1385

CCT CAG CGC CAG CCT CCT TAT TAC TCC TGC CAG TGT GCC CCA CCA TTC 4224
Pro Gln Arg Gln Pro Pro Tyr Tyr Ser Cys Gln Cys Ala Pro Pro Phe
1390 1395 1405

TCG GGT AGC CGC TGT GAA CTC TAC ACG GCA CCC CCC AGC ACC CCT CCT 4272
Ser Gly Ser Arg Cys Glu Leu Tyr Thr Ala Pro Pro Ser Thr Pro Pro
1410 1415 1420

GCC ACC TGT CTG AGC CAG TAT TGT GCC GAC AAA GCT CGG GAT GGC GTC 4320
Ala Thr Cys Leu Ser Gln Tyr Cys Ala Asp Lys Ala Arg Asp Gly Val
1425 1430

TGT GAT GAG GCC TGC AAC AGC CAT GCC TGC CAG TGG GAT GGG GGT GAG 4368
Cys Asp Glu Ala Cys Asn Ser His Ala Cys Gln Trp Asp Gly Gly Asp
1440 1445 1450

TGT TCT CTC ACC ATG GAG AAC CCC TGG GCC AAC TGC TCC TCC CCA CTT 4416
Cys Ser Leu Thr Met Glu Asn Pro Trp Ala Asn Cys Ser Ser Pro Leu
1455 1460 1465

CCC TGC TGG GAT TAT ATC AAC AAC CAG TGT GAT GAG CTG TGC AAC ACG 4464
Pro Cys Trp Asp Tyr Ile Asn Asn Gln Cys Asp Glu Leu Cys Asn Thr
1470 1475 1480 1485

GTC GAG TGC CTG TTT GAC AAC TTT GAA TGC CAG GGG AAC AGC AAG ACA 4512
Val Glu Cys Leu Phe Asp Asn Phe Glu Cys Gln Gly Asn Ser Lys Thr
1490 1495 1500

TGC AAG TAT GAC AAA TAC TGT GCA GAC CAC TTC AAA GAC AAC CAC TGT 4560
Cys Lys Tyr Asp Lys Tyr Cys Ala Asp His Phe Lys Asp Asn His Cys
1505 1510 1515

AAC CAG GGG TGC AAC AGT GAG GAG TGT GGT TGG GAT GGG CTG GAC TGT 4608
Asn Gln Gly Cys Asn Ser Glu Glu Cys Gly Trp Asp Gly Leu Asp Cys
1520 1525 1530

GCT GCT GAC CAA CCT GAG AAC CTG GCA GAA GGT ACC CTG GTT ATT GTG 4656
Ala Ala Asp Gln Pro Glu Asn Leu Ala Glu Gly Thr Leu Val Ile Val
1535 1540 1545

GTA TTG ATG CCA CCT GAA CAA CTG CTC CAG GAT GCT CGC AGC TTC TTG 4704
Val Leu Met Pro Pro Glu Gln Leu Leu Gln Asp Ala Arg Ser Phe Leu
1550 1555 1560 1565

CGG GCA CTG GGT ACC CTG CTC CAC ACC AAC CTG CGC ATT AAG CGC GAC 4752
Arg Ala Leu Gly Thr Leu Leu His Thr Asn Leu Arg Ile Lys Arg Asp
1570 1575 1580

TCC CAG GGG GAA CTC ATG GTG TAC CCC TAT TAT GGT GAG AAG TCA GCT 4800
Ser Gln Gly Glu Leu Met Val Tyr Pro Tyr Tyr Gly Glu Lys Ser Ala
1585 1590 1595

GCT ATG AAG AAA CAG AGG ATG ACA CGC AGA TCC CTT CCT GGT GAA CAA 4848
Ala Met Lys Lys Gln Arg Met Thr Arg Arg Ser Leu Pro Gly Glu Gln
1600 1605 1610

GAA CAG GAG GTG GCT GGC TCT AAA GTC TTT CTG GAA ATT GAC AAC CGC 4896
Glu Gln Glu Val Ala Gly Ser Lys Val Phe Leu Glu Ile Asp Asn Arg
1615 1620 1625

CAG TGT GTT CAA GAC TCA GAC CAC TGC TTC AAG AAC ACG GAT GCA GCA 4944
Gln Cys Val Gln Asp Ser Asp His Cys Phe Lys Asn Thr Asp Ala Ala
1630 1635 1640 1645

GCA GCT CTC CTG GCC TCT CAC GCC ATA CAG GGG ACC CTG TCA TAC CCT 4992
Ala Ala Leu Leu Ala Ser His Ala Ile Gln Gly Thr Leu Ser Tyr Pro
1650 1655 1660

CTT GTG TCT GTC GTC AGT GAA TCC CTG ACT CCA GAA CGC ACT CAG CTC 5040
Leu Val Ser Val Val Ser Glu Ser Leu Thr Pro Glu Arg Thr Gln Leu
1665 1670 1675

CTC TAT CTC CTT GCT GTT GCT GTT GTC ATC ATT CTG TTT ATT ATT CTG 5088
Leu Tyr Leu Leu Ala Val Ala Val Val Ile Ile Leu Phe Ile Ile Leu
1680 1685 1690

CTG GGG GTA ATC ATG GCA AAA CGA AAG CGT AAG CAT GGC TCT CTC TGG 5136
Leu Gly Val Ile Met Ala Lys Arg Lys Arg Lys His Gly Ser Leu Trp
1695 1700 1705

CTG CCT GAA GGT TTC ACT CTT CGC CGA GAT GCA AGC AAT CAC AAG CGT 5184
Leu Pro Glu Gly Phe Thr Leu Arg Arg Asp Ala Ser Asn His Lys Arg
1710 1715 1720 1725

CGT GAG CCA GTG GGA CAG GAT GCT GTG GGG CTG AAA AAT CTC TCA GTG 5232
Arg Glu Pro Val Gly Gln Asp Ala Val Gly Leu Lys Asn Leu Ser Val
1730 1735 1740

CAA GTC TCA GAA GCT AAC CTA ATT GGT ACT GGA ACA AGT GAA CAC TGG 5280
Gln Val Ser Glu Ala Asn Leu Ile Gly Thr Gly Thr Ser Glu His Trp
1745 1750 1755

GTC GAT GAT GAA GGG CCC CAG CCA AAG AAA GTA AAG GCT GAA GAT GAG 5328
Val Asp Asp Glu Gly Pro Gln Pro Lys Lys Val Lys Ala Glu Asp Glu
1760 1765 1770

GCC TTA CTC TCA GAA GAA GAT GAC CCC ATT GAT CGA CGG CCA TGG ACA 5376
Ala Leu Leu Ser Glu Glu Asp Asp Pro Ile Asp Arg Arg Pro Trp Thr
1775 1780 1785

CAG CAG CAC CTT GAA GCT GCA GAC ATC CGT AGG ACA CCA TCG CTG GCT 5424
Gln Gln His Leu Glu Ala Ala Asp Ile Arg Arg Thr Pro Ser Leu Ala
1790 1795 1800 1805

CTC ACC CCT CCT CAG GCA GAG CAG GAG GTG GAT GTG TTA GAT GTG AAT 5472
Leu Thr Pro Pro Gln Ala Glu Gln Glu Val Asp Val Leu Asp Val Asn
1810 1815 1820

GTC CGT GGC CCA GAT GGC TGC ACC CCA TTG ATG TTG GCT TCT CTC CGA 5520
Val Arg Gly Pro Asp Gly Cys Thr Pro Leu Met Leu Ala Ser Leu Arg
1825 1830 1835

GGA GGC AGC TCA GAT TTG AGT GAT GAA GAT GAA GAT GCA GAG GAC TCT 5568
Gly Gly Ser Ser Asp Leu Ser Asp Glu Asp Glu Asp Ala Glu Asp Ser
1840 1845 1850

TCT GCT AAC ATC ATC ACA GAC TTG GTC TAC CAG GGT GCC AGC CTC CAG 5616
Ser Ala Asn Ile Ile Thr Asp Leu Val Tyr Gln Gly Ala Ser Leu Gln
1855 1860 1865

GCC CAG ACA GAC CGG ACT GGT GAG ATG GCC CTG CAC CTT GCA GCC CGC 5664
Ala Gln Thr Asp Arg Thr Gly Glu Met Ala Leu His Leu Ala Ala Arg
1870 1875 1880 1885

TAC TCA CGG GCT GAT GCT GCC AAG CGT CTC CTG GAT GCA GGT GCA GAT 5712
Tyr Ser Arg Ala Asp Ala Ala Lys Arg Leu Leu Asp Ala Gly Ala Asp
1890 1895 1900

GCC AAT GCC CAG GAC AAC ATG GGC CGC TGT CCA CTC CAT GCT GCA GTG 5760
Ala Asn Ala Gln Asp Asn Met Gly Arg Cys Pro Leu His Ala Ala Val
1905 1910 1915

GCA GCT GAT GCC CAA GGT GTC TTC CAG ATT CTG ATT CGC AAC CGA GTA 5808
Ala Ala Asp Ala Gln Gly Val Phe Gln Ile Leu Ile Arg Asn Arg Val
1920 1925 1930

ACT GAT CTA GAT GCC AGG ATG AAT GAT GGT ACT ACA CCC CTG ATC CTG 5856
Thr Asp Leu Asp Ala Arg Met Asn Asp Gly Thr Thr Pro Leu Ile Leu
1935 1940 1945

GCT GCC CGC CTG GCT GTG GAG GGA ATG GTG GCA GAA CTG ATC AAC TGC 5904
Ala Ala Arg Leu Ala Val Glu Gly Met Val Ala Glu Leu Ile Asn Cys
1950 1955 1960 1965

CAA GCG GAT GTG AAT GCA GTG GAT GAC CAT GGA AAA TCT GCT CTT CAC 5952
Gln Ala Asp Val Asn Ala Val Asp Asp His Gly Lys Ser Ala Leu His
1970 1975 1980

TGG GCA GCT GCT GTC AAT AAT GTG GAG GCA ACT CTT TTG TTG TTG AAA 6000
Trp Ala Ala Ala Val Asn Asn Val Glu Ala Thr Leu Leu Leu Leu Lys
1985 1990 1995

AAT GGG GCC AAC CGA GAC ATG CAG GAC AAC AAG GAA GAG ACA CCT CTG 6048
Asn Gly Ala Asn Arg Asp Met Gln Asp Asn Lys Glu Glu Thr Pro Leu
2000 2005 2010

TTT CTT GCT GCC CGG GAG GGG AGC TAT GAA GCA GCC AAG ATC CTG TTA 6096
Phe Leu Ala Ala Arg Glu Gly Ser Tyr Glu Ala Ala Lys Ile Leu Leu
2015 2020 2025

GAC CAT TTT GCC AAT CGA GAC ATC ACA GAC CAT ATG GAT CGT CTT CCC 6144
Asp His Phe Ala Asn Arg Asp Ile Thr Asp His Met Asp Arg Leu Pro
2030 2035 2040 2045

CGG GAT GTG GCT CGG GAT CGC ATG CAC CAT GAC ATT GTG CGC CTT CTG 6192
Arg Asp Val Ala Arg Asp Arg Met His His Asp Ile Val Arg Leu Leu
2050 2055 2060

GAT GAA TAC AAT GTG ACC CCA AGC CCT CCA GGC ACC GTG TTG ACT TCT 6240
Asp Glu Tyr Asn Val Thr Pro Ser Pro Pro Gly Thr Val Leu Thr Ser
2065 2070 2075

GCT CTC TCA CCT GTC ATC TGT GGG CCC AAC AGA TCT TTC CTC AGC CTG 6288
Ala Leu Ser Pro Val Ile Cys Gly Pro Asn Arg Ser Phe Leu Ser Leu
2080 2085 2090

AAG CAC ACC CCA ATG GGC AAG AAG TCT AGA CGG CCC AGT GCC AAG AGT 6336
Lys His Thr Pro Met Gly Lys Lys Ser Arg Arg Pro Ser Ala Lys Ser
2095 2100 2105

ACC ATG CCT ACT AGC CTC CCT AAC CTT GCC AAG GAG GCA AAG GAT GCC 6384
Thr Met Pro Thr Ser Leu Pro Asn Leu Ala Lys Glu Ala Lys Asp Ala
2110 2115 2120 2125

AAG GGT AGT AGG AGG AAG AAG TCT CTG AGT GAG AAG GTC CAA CTG TCT 6432
Lys Gly Ser Arg Arg Lys Lys Ser Leu Ser Glu Lys Val Gln Leu Ser
2140

GAG AGT TCA GTA ACT TTA TCC CCT GTT GAT TCC CTA GAA TCT CCT CAC 6480
Glu Ser Ser Val Thr Leu Ser Pro Val Asp Ser Leu Glu Ser Pro His
2145 2150 2155

ACG TAT GTT TCC GAC ACC ACA TCC TCT CCA ATC ATT ACA TCC CCT GGG 6528
Thr Tyr Val Ser Asp Thr Thr Ser Ser Pro Met Ile Thr Ser Pro Gly
2160 2165 2170

ATC TTA CAG GCC TCA CCC AAC CCT ATG TTG GCC ACT GCC GCC CCT CCT 6576
Ile Leu Gln Ala Ser Pro Asn Pro Met Leu Ala Thr Ala Ala Pro Pro
2175 2180 2185

GCC CCA GTC CAT GCC CAG CAT GCA CTA TCT TTT TCT AAC CTT CAT GAA 6624
Ala Pro Val His Ala Gln His Ala Leu Ser Phe Ser Asn Leu His Glu
2190 2195 2200 2205

ATG CAG CCT TTG GCA CAT GGG GCC AGC ACT GTG CTT CCC TCA GTG AGC 6672
Met Gln Pro Leu Ala His Gly Ala Ser Thr Val Leu Pro Ser Val Ser
2210 2215 2220

CAG TTG CTA TCC CAC CAC CAC ATT GTG TCT CCA GGC AGT GGC AGT GCT 6720
Gln Leu Leu Ser His His His Ile Val Ser Pro Gly Ser Gly Ser Ala
2225 2230 2235

GGA AGC TTG AGT AGG CTC CAT CCA GTC CCA GTC CCA GCA GAT TGG ATG 6768
Gly Ser Leu Ser Arg Leu His Pro Val Pro Val Pro Ala Asp Trp Met
2240 2245 2250

AAC CGC ATG GAG GTG AAT GAG ACC CAG TAC AAT GAG ATG TTT GGT ATG 6816
Asn Arg Met Glu Val Asn Glu Thr Gln Tyr Asn Glu Met Phe Gly Met
2255 2260 2265

GTC CTG GCT CCA GCT GAG GGC ACC CAT CCT GCC ATA GCT CCC CAG AGC 6864
Val Leu Ala Pro Ala Glu Gly Thr His Pro Gly Ile Ala Pro Gln Ser
2270 2275 2280 2285

AGG CCA CCT GAA CGG AAG CAC ATA ACC ACC CCT CGG GAG CCC TTG CCC 6912
Arg Pro Pro Glu Gly Lys His Ile Thr Thr Pro Arg Glu Pro Leu Pro
2290 2295 2300

CCC ATT GTG ACT TTC CAG CTC ATC CCT AAA GGC AGT ATT GCC CAA CCA 6960
Pro Ile Val Thr Phe Gln Leu Ile Pro Lys Gly Ser Ile Ala Gln Pro
2305 2310 2315

GCG GGG GCT CCC CAG CCT CAG TCC ACC TGC CCT CCA GCT GTT GCG GGC 7008
Ala Gly Ala Pro Gln Pro Gln Ser Thr Cys Pro Pro Ala Val Ala Gly
2320 2325 2330



CCC CTG CCC ACC ATG TAC CAG ATT CCA GAA ATG GCC CGT TTG CCC AGT 7056
Pro Leu Pro Thr Met Tyr Gln Ile Pro Glu Met Ala Arg Leu Pro Ser
2335 2340 2345

GTG GCT TTC CCC ACT GCC ATG ATG CCC CAG CAG GAC GGG CAG GTA GCT 7104
Val Ala Phe Pro Thr Ala Met Met Pro Gln Gln Asp Gly Gln Val Ala
2350 2355 2360 2365

CAG ACC ATT CTC CCA GCC TAT CAT CCT TTC CCA GCC TCT GTG GGC AAG 7152
Gln Thr Ile Leu Pro Ala Tyr His Pro Phe Pro Ala Ser Val Gly Lys
2370 2375 2380

TAC CCC ACA CCC CCT TCA CAG CAC AGT TAT GCT TCC TCA AAT GCT GCT 7200
Tyr Pro Thr Pro Pro Ser Gln His Ser Tyr Ala Ser Ser Asn Ala Ala
2385 2390 2395

GAG CGA ACA CCC AGT CAC AGT GGT CAC CTC CAG GGT GAG CAT CCC TAC 7248
Glu Arg Thr Pro Ser His Ser Gly His Leu Gln Gly Glu His Pro Tyr
2400 2405 2410

CTG ACA CCA TCC CCA GAG TCT CCT GAC CAG TGG TCA AGT TCA TCA CCC 7296
Leu Thr Pro Ser Pro Glu Ser Pro Asp Gln Trp Ser Ser Ser Ser Pro
2415 2420 2425

CAC TCT GCT TCT GAC TGG TCA GAT GTG ACC ACC AGC CCT ACC CCT GGG 7344
His Ser Ala Ser Asp Trp Ser Asp Val Thr Thr Ser Pro Thr Pro Gly
2430 2435 2440 2445

GGT GCT GGA GGA GGT CAG CGG GGA CCT GGG ACA CAC ATG TCT GAG CCA 7392
Gly Ala Gly Gly Gly Gln Arg Gly Pro Gly Thr His Met Ser Glu Pro
2450 2455 2460

CCA CAC AAC AAC ATG CAG GTT TAT GCG TGAGAGAGTC CACCTCCAGT 7439
Pro His Asn Asn Met Gln Val Tyr Ala
2465 2470



GTAGAGACAT AACTGACTTT TGTAAATGCT GCTGAGGAAC AAATGAAGGT CATCCGGGAG 7499
AGAAATGAAG AAATCTCTGG AGCCAGCTTC TAGAGGTAGG AAAGAGAAGA TGTTCTTATT 7559
CAGATAATGC AAGAGAAGCA ATTCGTCAGT TTCACTGGGT ATCTGCAAGG CTTATTGATT 7619
ATTCTAATCT AATAAGACAA GTTTGTGGAA ATGCAAGATC AATACAAGCC TTGGGTCCAT 7679
GTTTACTCTC TTCTATTTGG AGAATAAGAT GGATGCTTAT TGAAGCCCAG ACATTCTTGC 7739
AGCTTGGACT GCATTTTAAG CCCTGCAGGC TTCTGCCATA TCCATGAGAA GATTCTACAC 7799
TAGCGTCCTQ TTGGGAATTA TGCCCTGCAA TTCTCCCTGA ATTGACCTAC GCATCTCCTC 7859
CTCCTTGGAC ATTCTTTTCT CTTCATTTGG TGCTTTTGGT TTTGCACCTC TCCGTGATTG 7919
TAGCCCTACC AGCATGTTAT AGGGCAAGAC CTTTGTGCTT TTGATCATTC TGGCCCATGA 7979
AAGCAACTTT GGTCTCCTTT CCCCTCCTGT CTTCCCGGTA TCCCTTGGAG TCTCACAAGG 8039
TTTACTTTGG TATGGTTCTC AGCACAAACC TTTCAAGTAT GTTGTTTCTT TGGAAAATGG 8099
ACATACTGTA TTGTGTTCTC CTGCATATAT CATTCCTGGA GAGAGAACGG GAGAAGAATA 8159
CTTTTCTTCA ACAAATTTTG GGGGCAGOAG ATCCCTTCAA GAGGCTGCAC CTTAATTTTT 8219
CTTGTCTGTG TGCAGGTCTT CATATAAACT TTACCAGOAA GAAGGGTGTC AGTTTGTTGT 8279
TTTTCTGTGT ATGGGCCTGG TCAGTGTAAA GTTTTATCCT TGATAGTCTA GTTACTATGA 8339
CCCTCCCCAC TTTTTTAAAA CCAGAAAAAG GTTTGGAATG TTGGAATGAC CAAGAGACAA 8399
GTTAACTCGT GCAAGAGCCA GTTACCCACC CACAGGTCCC CCTACTTCCT GCCAAGCATT 8459
CCATTGACTG CCTGTATGGA ACACATTTGT CCCAGATCTG AGCATTCTAG GCCTGTTTCA 8519
CTCACTCACC CAGCATATGA AACTAGTCTT AACTGTTGAO CCTTTCCTTT CATATCCACA 8579
GAAGACACTG TCTCAAATGT TGTACCCTTG CCATTTAGGA CTGAACTTTC CTTAGCCCAA 8639
GGGACCCAGT GACAGTTGTC TTCCGTTTGT CAGATGATCA GTCTCTACTG ATTATCTTOC 8699
TGCTTAAAGG CCTGCTCACC AATCTTTCTT TCACACCGTG TGGTCCGTGT TACTGGTATA 8759 
CCCAGTATGT TCTCACTGAA GACATGGACT TTATATOTTC AAGTGCAGGA ATTGGAAAGT 8819 
TGGACTTGTT TTCTATCATC CAAAACAGCC CTATAAGAAG GTTGGAAAAG CAGGAACTAT 8879 
ATAGCAGCCT TTGCTATTTT CTGCTACCAT TTCTTTTCCT CTGAAGCGGC CATGACATTC 8939 
CCTTTGGCAA CTAAOGTAGA AACTCAACAG AACATTTTCC TTTCCTAGAG TCACCTTTTA 8999 
GATGATAATG GACAACTATA GACTTGCTCA TTGTTCAGAC TGATTCCCCC TCACCTGAAT 9059 
CCACTCTCTG TATTCATCCT CTTGGCAATT TCTTTGACTT TCTTTTAAGG GCAGAAGCAT 9119 
TTTAGTTAAT TGTAGATAAA GAATAOTTTT CTTCCTCTTC TCCTTGGGCC AGTTAATAAT 9179 
TGGTCCATGG CTACACTGCA ACTTCCGTCC AGTGCTGTGA TGCCCATGAC ACCTGCAAAA 9239 
TAAGTTCTGC CTGGGCATTT TGTAGATATT AACAGGTGAA TTCCCGACTC TTTTGGTTTG 9299 
AATGACAGTT CTCATTCCTT CTATGGCTGC AAGTATGCAT CAGTGCTTCC CACTTACCTG 9359
ATTTGTCTGT CGGTGGCCCC ATATGGAAAC CCTGCGTGTC TGiTGGCATA ATAGTTTACA 9419
AATGGTTTTT TCAGTCCTAT CCAAATTTAT TGAACCAACA AAAATAATTA CTTCTGCCCT 9479
GAGATAAGCA GATTAAGTTT GTTCATTCTC TGCTTTATTC TCTCCATGTG GCAACATTCT 9539
GTCAGCCTCT TTCATAGTGT GCAAACATTT TATCATTCTA AATGGTGACT CTCTGCCCTT 9599
GGACCCATTT ATTATTCACA GATGGGGAGA ACCTATCTGC ATGGACCCTC ACCATCCTCT 9659
GTGCAGCACA CACAGTGCAG GGAGCCAGTG GCGATGGCGA TGACTTTCTT CCCCTGGGAA 9719
TTCC 9723

WHAT IS CLAIMED IS:

1. A pharmaceutical composition comprising a therapeutically 
effective amount of a Notch protein; and a pharmaceutically acceptable carrier. 

2. The composition of claim 1 in which the Notch protein is a 
human Notch protein. 

3. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising an amino acid sequence
encoded by the DNA sequence depicted in Figure 8A (SEQ ID NO:5), 8B (SEQ
ID NO:6), 8C (SEQ ID NO:7), 9A (SEQ ID NO:8), or 9B (SEQ ID NO:9),
which is able to be bound by an antibody to a Notch protein; and a
pharmaceutically acceptable carrier.

4. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising a Notch amino acid
sequence depicted in Figure 8A (SEQ ID NO:5), 8B (SEQ ID NO:6), 8C (SEQ
ID NO:7), 9A (SEQ ID NO:8), or 9B (SEQ ID NO:9), which displays one or
more functional activities associated with a full-length Notch protein; and a
pharmaceutically acceptable carrier.

5. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising a fragment of a human
Notch protein consisting essentially of the extracellular domain of the protein; and
a pharmaceutically acceptable carrier.

6. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising a region of a Notch protein
containing the EGF homologous repeats of the protein; and a pharmaceutically
acceptable carrier.

7. A pharmaceutical composition comprising a therapeutically
effective amount of a fragment of a Notch protein lacking a portion of the EGF-
homologous repeats of the protein, which fragment is able to be bound by an
antibody to a Notch protein; and a pharmaceutically acceptable carrier.

8. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising a functionally active
portion of a Notch protein; and a pharmaceutically acceptable carrier.

9. The composition of claim 8 in which the Notch protein is a
human Notch protein.

10. A pharmaceutical composition comprising a therapeutically
effective amount of a chimeric protein, said chimeric protein comprising a
functionally active portion of a human Notch protein joined via a peptide bond to
a sequence of a protein different from the Notch protein; and a pharmaceutically
acceptable carrier.

11. The composition of claim 10 in which the functionally active
portion of the Notch protein is encoded by the human cDNA sequence contained
in plasmid hN3k as deposited with the ATCC and assigned accession number
68609, or encoded by the human cDNA sequence contained in plasmid hN5k as
deposited with the ATCC and assigned accession number 68611.

12. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising the amino acid sequence
depicted in Figure 10 (SEQ ID NO:11); and a pharmaceutically acceptable
carrier.

13. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising the amino acid sequence
depicted in Figure 11 (SEQ ID NO:13); and a pharmaceutically acceptable
carrier.

14. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising the portion of a human
Notch protein with the greatest homology to the epidermal growth factor-like
repeats 11 and 12 of the Drosophila Notch sequence as shown in Figure 4 (SEQ
ID NO:14); and a pharmaceutically acceptable carrier.

15. A pharmaceutical composition comprising a therapeutically
effective amount of a derivative or analog of a Notch protein, which derivative or
analog is characterized by the ability in vitro, when expressed on the surface of a
first cell, to bind to a Delta protein expressed on the surface of a second cell; and
a pharmaceutically acceptable carrier.

16. A pharmaceutical composition comprising a therapeutically
effective amount of a chimeric protein, said chimeric protein comprising a Notch
protein joined via a peptide bond to a protein sequence of a protein different from
the Notch protein; and a pharmaceutically acceptable carrier.

17. A pharmaceutical composition comprising a therapeutically
effective amount of a fragment of a Notch protein, which fragment is
characterized by the ability in vitro, when expressed on the surface of a first cell,
to bind to a Delta protein expressed on the surface of a second cell; and a
pharmaceutically acceptable carrier.

18. A pharmaceutical composition comprising a therapeutically
effective amount of a chimeric protein, said chimeric protein comprising a
fragment of a Notch protein joined via a peptide bond to a protein sequence of a
protein different from the Notch protein, which fragment is characterized by the
ability in vitro, when expressed on the surface of a first cell, to bind to a Delta
protein expressed on the surface of a second cell; and a pharmaceutically
acceptable carrier.

19. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising a derivative or analog of a
Delta protein, which derivative or analog is characterized by the ability in vitro,
when expressed on the surface of a first cell, to bind to a Notch protein expressed
on the surface of a second cell; and a pharmaceutically acceptable carrier.

20. A pharmaceutical composition comprising a therapeutically
effective amount of a chimeric protein, said chimeric protein comprising a
fragment of a Delta protein joined via a peptide bond to a protein sequence of a
protein different from the Delta protein, which fragment is characterized by the
ability in vitro, when expressed on the surface of a first cell, to bind to a Notch
protein expressed on the surface of a second cell; and a pharmaceutically
acceptable carrier.

21. A pharmaceutical composition comprising a therapeutically
effective amount of a protein, said protein comprising a derivative or analog of a
Serrate protein, which derivative or analog is characterized by the ability in vitro,
when expressed on the surface of a first cell, to bind to a Notch protein expressed
on the surface of a second cell; and a pharmaceutically acceptable carrier.

22. A pharmaceutical composition comprising a therapeutically
effective amount of a derivative or analog of a Notch protein, which derivative or
analog is characterized by the ability in vitro, when expressed on the surface of a
first cell, to bind to a second protein expressed on the surface of a second cell,
which second protein is selected from the group consisting of a Delta protein and
a Serrate protein; and a pharmaceutically acceptable carrier.

23. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding a Notch protein; and a
pharmaceutically acceptable carrier.

24. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding a functionally active portion of a
human Notch protein; and a pharmaceutically acceptable carrier.

25. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding the amino acid sequence depicted in
Figure 10 (SEQ ID NO:11); and a pharmaceutically acceptable carrier.

26. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding the amino acid sequence depicted in
Figure 11 (SEQ ID NO:13); and a pharmaceutically acceptable carrier.

27. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding a fragment of a Notch protein, which
fragment is characterized by the ability in vitro, when expressed on the surface of
a first cell, to bind to a Delta protein expressed on the surface of a second cell;
and a pharmaceutically acceptable carrier.

28. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding a chimeric protein, said chimeric
protein comprising a functionally active fragment of a human Notch protein joined
via a peptide bond to a protein sequence of a protein different from the Notch
protein; and a pharmaceutically acceptable carrier.

29. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding a fragment of a Delta protein, which
fragment is characterized by the ability in vitro, when expressed on the surface of
a first cell, to bind to a Notch protein expressed on the surface of a second cell;
and a pharmaceutically acceptable carrier.

30. A pharmaceutical composition comprising a therapeutically
effective amount of a nucleic acid encoding a fragment of a Serrate protein, which
fragment is characterized by the ability in vitro, when expressed on the surface of
a first cell, to bind to a Notch protein expressed on the surface of a second cell;
and a pharmaceutically acceptable carrier.

31. The composition of claim 24 in which the nucleic acid is a
nucleic acid vector.

32. A pharmaceutical composition comprising a therapeutically
effective amount of an antibody which binds to a Notch protein; and a
pharmaceutically acceptable carrier.

33. A pharmaceutical composition comprising a therapeutically 
effective amount of a fragment or derivative of an antibody to a Notch protein 
containing the idiotype of the antibody; and a pharmaceutically acceptable carrier. 



34. A method of treating or preventing a disease or disorder in a 
subject comprising administering to a subject in need of such treatment or 
prevention a therapeutically effective amount of a molecule which antagonizes the 
function of a Notch protein. 



35. The method according to claim 34 in which the disease or 
disorder is a malignancy characterized by increased Notch activity or increased 
expression of a Notch protein or of a Notch derivative capable of being bound by 
an anti-Notch antibody, relative to said Notch activity or expression in an 
analogous non-malignant sample.
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36. The method according to claim 34 in which the disease or 
disorder is cervical cancer. 

37. The method according to claim 34 in which the disease or 
disorder is breast cancer.

38. The method according to claim 34 in which the disease or
disorder is colon cancer. 

39. The method according to claim 35 in which the malignancy is
selected from the group consisting of melanoma, seminoma, and lung cancer.

40. The method according to claim 35 in which the subject is a
human.

41. The method according to claim 36, 37 or 38 in which the 
molecule is an antibody to Notch or a portion of said antibody containing the 
binding domain thereof. 

42. The method according to claim 36, 37 or 38 in which the
molecule is a protein consisting of at least the extracellular domain of a Notch
protein or a portion thereof capable of binding to a Notch ligand.



43. The method according to claim 36, 37 or 38 in which the
molecule is a protein consisting of at least the EGF homologous repeats of a
Notch protein.

44. The method according to claim 36, 37 or 38 in which the
molecule is a protein consisting of at least an adhesive fragment of a Notch
protein.
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45. The method according to claim 36, 37 or 38 in which the 
molecule is an oligonucleotide which (a) consists of at least six nucleotides; (b) 
comprises a sequence complementary to at least a portion of an RNA transcript of 
a Notch gene; and (c) is hybridizable to the RNA transcript. 

46. A method of treating or preventing a disease or disorder in a 
subject in need of such treatment or prevention comprising administering to the 
subject a therapeutically effective amount of a molecule which promotes the 
function of a Notch protein. 

47. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of a Notch protein. 



48. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of a functionally active portion of a Notch protein. 

49. The method according to claim 47 in which the Notch protein
is a human Notch protein.

50. A method of treating or preventing a malignancy in a subject
comprising administering to a subject in need of such treatment or prevention an
effective amount of a chimeric protein, said protein comprising a functionally
active portion of a Notch protein joined via a peptide bond to a protein sequence
of a protein different from the Notch protein.

51. The method according to claim 49 in which the human Notch
protein comprises the amino acid sequence depicted in Figure 10 (SEQ ID
NO:11) or Figure 11 (SEQ ID NO:13).
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52. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an
effective amount of a derivative or analog of a Notch protein, which derivative or
analog is characterized by the ability in vitro, when expressed on the surface of a
first cell, to bind to a second protein expressed on the surface of a second cell,
which second protein is selected from the group consisting of a Delta protein and
a Serrate protein.

53. A method of treating or preventing a malignancy in a subject
comprising administering to a subject in need of such treatment or prevention an
effective amount of a derivative or analog of a Delta protein, which derivative or
analog is characterized by the ability in vitro, when expressed on the surface of a
first cell, to bind to a Notch protein expressed on the surface of a second cell.

54. A method of treating or preventing a malignancy in a subject
comprising administering to a subject in need of such treatment or prevention an
effective amount of a derivative or analog of a Serrate protein, which derivative
or analog is characterized by the ability in vitro, when expressed on the surface of
a first cell, to bind to a Notch protein expressed on the surface of a second cell.

55. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of a nucleic acid encoding a Notch protein. 

56. A method of treating or preventing a malignancy in a subject
comprising administering to a subject in need of such treatment or prevention an
effective amount of a nucleic acid encoding a functionally active portion of a
Notch protein.

57. The method according to claim 55 in which the subject is 
human and the Notch protein is a human Notch protein. 

58. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of a nucleic acid encoding a fragment of a Notch protein, which 
fragment is characterized by the ability in vitro, when expressed on the surface of 
a first cell, to bind to a second protein expressed on the surface of a second cell, 
which second protein is selected from the group consisting of a Delta protein and 
a Serrate protein. 

59. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of a nucleic acid encoding a fragment of a Delta protein, which 
fragment is characterized by the ability in vitro, when expressed on the surface of 
a first cell, to bind to a Notch protein expressed on the surface of a second cell. 

60. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of a nucleic acid encoding a fragment of a Serrate protein, which 
fragment is characterized by the ability in vitro, when expressed on the surface of 
a first cell, to bind to a Notch protein expressed on the surface of a second cell. 


61. A method of treating or preventing a malignancy in a subject 
comprising administering to a subject in need of such treatment or prevention an 
effective amount of an antibody to a Notch protein. 

62. The method according to claim 58 in which the antibody is 
monoclonal. 

63. A method for treating a patient with a tumor, of a tumor type 
characterized by expression of a Notch gene, comprising administering to the 
patient an effective amount of an oligonucleotide, which oligonucleotide (a) 
consists of at least six nucleotides; (b) comprises a sequence complementary to at 
least a portion of an RNA transcript of the Notch gene; and (c) is hybridizable to 
the RNA transcript. 

64. The method according to claim 60 in which the patient is a 
human, and the Notch gene is a human gene. 

65. An isolated oligonucleotide consisting of at least six 
nucleotides, and comprising a sequence complementary to at least a portion of an 
RNA transcript of a Notch gene, which oligonucleotide is hybridizable to the 
RNA transcript. 

66. A pharmaceutical composition comprising the oligonucleotide 
of claim 65; and a pharmaceutically acceptable carrier. 

67. A method of inhibiting the expression of a nucleic acid 
sequence encoding a Notch protein in a cell comprising providing the cell with an 
effective amount of the oligonucleotide of claim 65. 
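
The sequence-level criteria recited for the oligonucleotide (at least six nucleotides; a sequence complementary to at least a portion of an RNA transcript) can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not part of the claimed subject matter: the function names, the toy transcript, and the `min_match` window length are hypothetical, and hybridizability, which depends on experimental conditions, is not modeled.

```python
# Illustrative sketch only. The transcript below is a made-up toy string,
# not a Notch sequence, and min_match is a hypothetical parameter.

_RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}
_DNA_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def antisense_dna(rna_segment):
    """Return the DNA oligo (5'->3') complementary to an RNA segment (5'->3')."""
    return "".join(_RNA_TO_DNA[base] for base in reversed(rna_segment.upper()))


def target_site(dna_oligo):
    """Return the RNA sequence (5'->3') that a DNA oligo (5'->3') would pair with."""
    return "".join(_DNA_TO_RNA[base] for base in reversed(dna_oligo.upper()))


def meets_sequence_criteria(dna_oligo, transcript, min_len=6, min_match=6):
    """Check length and complementarity only; hybridization is not modeled.

    min_len=6 mirrors the "at least six nucleotides" wording; min_match is an
    assumed window size for "at least a portion" of the transcript.
    """
    oligo = dna_oligo.upper()
    transcript = transcript.upper()
    if len(oligo) < min_len:
        return False
    # Slide a window along the oligo and look for its pairing site in the RNA.
    for start in range(len(oligo) - min_match + 1):
        if target_site(oligo[start:start + min_match]) in transcript:
            return True
    return False


TOY_TRANSCRIPT = "AUGCAUUGGAUUAAAUGU"  # hypothetical example string
candidate = antisense_dna(TOY_TRANSCRIPT[:9])
print(candidate, meets_sequence_criteria(candidate, TOY_TRANSCRIPT))
```

A check of this kind only screens candidate sequences in silico; whether a given oligonucleotide actually hybridizes to the transcript would still have to be determined experimentally.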

68. A method of diagnosing a disease or disorder characterized by 
an aberrant level of Notch protein or activity in a patient, comprising measuring 
the level of Notch protein expression or activity in a sample derived from the 
patient, in which an increase or decrease in Notch protein or activity in the 
patient sample relative to the level found in such a sample from a normal 
individual indicates the presence of the disease or disorder in the patient. 


69. A method of diagnosing a malignancy characterized by an 
increased amount of a Notch protein or of a Notch derivative capable of being 
bound by an anti-Notch antibody, comprising measuring the amount of a Notch 
protein or of a Notch derivative capable of being bound by an anti-Notch 
antibody, in a sample containing or suspected of containing malignant cells from a 
patient, in which an increase in the amount of the Notch protein or of the Notch 
derivative capable of being bound by an anti-Notch antibody, in the sample, 
relative to said amount found in an analogous sample of non-malignant cells 
indicates the presence of the disease or disorder in the patient. 

70. The method according to claim 69 in which the malignancy is 
cervical cancer. 

71. The method according to claim 69 in which the malignancy is 
breast cancer. 

72. The method according to claim 69 in which the malignancy is 
colon cancer. 

73. The method according to claim 69 in which the malignancy is 
selected from the group consisting of melanoma, seminoma, and lung cancer. 

74. The method according to claim 69 in which the amount of the 
Notch protein or derivative is measured by a method comprising contacting the 
sample with an anti-Notch antibody such that immunospecific binding can occur, 
and measuring the amount of any immunospecific binding of the antibody that 
occurs. 






75. A method of treating or preventing a nervous system disorder 
in a subject comprising administering to a subject in need of such treatment or 
prevention an effective amount of a functionally active portion of a Notch protein. 

76. A method of promoting tissue regeneration or repair in a 
subject comprising administering to a subject an effective amount of a functionally 
active portion of a Notch protein. 

77. A method of treating a benign dysproliferative disorder in a 
subject comprising administering to a subject in need of such treatment an 
effective amount of a functionally active portion of a Notch protein, in which the 
disorder is selected from the group consisting of cirrhosis of the liver, psoriasis, 
keloids, and baldness. 


78. A substantially purified human Notch protein comprising the 
amino acid sequence encoded by the hN homolog as depicted in Figure 13 from 
amino acid numbers 1 through 2169 (SEQ ID NO:19). 

79. A substantially purified human Notch protein comprising the 
amino acid sequence encoded by the hN homolog as depicted in Figure 13 from 
amino acid numbers about 26 through 2169 (as contained in SEQ ID NO:19). 



80. A substantially purified protein comprising the extracellular 
domain of the mature human Notch protein encoded by the hN homolog, as 
depicted in Figure 13 from amino acid numbers about 26 through 1677 (as 
contained in SEQ ID NO:19). 

81. A substantially purified protein comprising the EGF 
homologous repeats of the mature human Notch protein encoded by the hN 
homolog, as depicted in Figure 13 from amino acid numbers 26 through 1413 (as 
contained in SEQ ID NO:19). 

82. A substantially purified protein comprising the EGF-like 
repeats 11 and 12 of the mature human Notch protein encoded by the hN 
homolog, as depicted in Figure 13 (as contained in SEQ ID NO:19). 

83. A substantially purified protein consisting essentially of the 
extracellular domain of the mature human Notch protein encoded by the hN 
homolog, as depicted in Figure 13 from amino acid numbers about 26 through 
1677 (as contained in SEQ ID NO:19). 

84. A substantially purified nucleic acid encoding the protein of 
claim 78. 

85. A substantially purified nucleic acid encoding the protein of 
claim 79. 

86. A substantially purified nucleic acid encoding the protein of 
claim 80. 

87. A substantially purified nucleic acid encoding the protein of 
claim 82. 


88. The nucleic acid of claim 85 which is a DNA molecule 
comprising the sequence depicted in Figure 17 from nucleotide numbers 82 
through 7419 (as contained in SEQ ID NO:21). 

89. The nucleic acid of claim 80 in which the sequence encoding 
the extracellular domain is as presented in Figure 17 (as contained in 
SEQ ID NO:21). 

90. A recombinant cell containing the nucleic acid of claim 84, 87 
or 88. 

91. The composition of claim 2 in which the Notch protein 
comprises the amino acid sequence encoded by the hN homolog as depicted in 
Figure 13 from amino acid numbers 26 through 2169 (as contained in 
SEQ ID NO:19). 

92. A composition comprising a therapeutically effective amount 
of a Notch protein or Notch derivative, said derivative being capable of being 
bound by an anti-Notch antibody, for use as a medicament. 

93. A composition comprising a therapeutically effective amount 
of a molecule which antagonizes the function of a Notch protein, for use as a 
medicament. 

94. Use of a composition comprising a molecule which 
antagonizes the function of a Notch protein, for the manufacture of a medicament 
for the treatment of cervical cancer, breast cancer, or colon cancer. 




GAATTCGGAG GAATTATTCA AAACATAAAC ACAATAAACA ATTTGAGTAG TTGCCGCACA 60 

CACACACACA CACAGCCCGT GGATTATTAC ACTAAAAGCG ACACTCAATC CAAAAAATCA 120 

GCAACAAAAA CATCAATAAA C ATG CAT TGG ATT AAA TGT TTA TTA ACA GCA 171 

Met His Trp Ile Lys Cys Leu Leu Thr Ala 
1 5 10 

TTC ATT TGC TTC ACA GTC ATC GTG CAG GTT CAC AGT TCC GGC AGC TTT 219 
Phe Ile Cys Phe Thr Val Ile Val Gln Val His Ser Ser Gly Ser Phe 
15 20 25 

GAG TTG CGC CTG AAG TAC TTC AGC AAC GAT CAC GGG CGG GAC AAC GAG 267 
Glu Leu Arg Leu Lys Tyr Phe Ser Asn Asp His Gly Arg Asp Asn Glu 
30 35 40 

GGT CGC TGC TGC AGC GGG GAG TCG GAC GGA GCG ACG GGC AAG TGC CTG 315 
Gly Arg Cys Cys Ser Gly Glu Ser Asp Gly Ala Thr Gly Lys Cys Leu 
45 50 55 

GGC AGC TGC AAG ACG CGG TTT CGC GTC TGC CTA AAG CAC TAC CAG GCC 363 
Gly Ser Cys Lys Thr Arg Phe Arg Val Cys Leu Lys His Tyr Gln Ala 
60 65 70 

ACC ATC GAC ACC ACC TCC CAG TGC ACC TAC GGG GAC GTG ATC ACG CCC 411 
Thr Ile Asp Thr Thr Ser Gln Cys Thr Tyr Gly Asp Val Ile Thr Pro 
75 80 85 90 

ATI CTC GGC GAG AAC TCG GTC AAT CTG ACC GAC GCC CAG CGC TTC CAG 459 
Ile Leu Gly Glu Asn Ser Val Asn Leu Thr Asp Ala Gln Arg Phe Gln 
95 100 105 

AAC AAG GGC TTC ACG AAT CCC ATC CAG TTC CCC TTC TCG TTC TCA TGG 507 
Asn Lys Gly Phe Thr Asn Pro Ile Gln Phe Pro Phe Ser Phe Ser Trp 
110 115 120 

CCG GGT ACC TTC TCG CTG ATC GTC GAG GCC TGG CAT GAT ACG AAC AAT 555 
Pro Gly Thr Phe Ser Leu Ile Val Glu Ala Trp His Asp Thr Asn Asn 
125 130 135 

AGC GGC AAT GCG CGA ACC AAC AAG CTC CTC ATC CAG CGA CTC TTG GTG 603 
Ser Gly Asn Ala Arg Thr Asn Lys Leu Leu Ile Gln Arg Leu Leu Val 
140 145 150 

CAG CAG GTA CTG GAG GTG TCC TCC GAA TGG AAG ACG AAC AAG TCG GAA 651 
Gln Gln Val Leu Glu Val Ser Ser Glu Trp Lys Thr Asn Lys Ser Glu 
155 160 165 170 

TCG CAG TAC ACG TCG CTG GAG TAC GAT TTC CGT GTC ACC TGC GAT CTC 699 
Ser Gln Tyr Thr Ser Leu Glu Tyr Asp Phe Arg Val Thr Cys Asp Leu 
175 180 185 

AAC TAC TAC GGA TCC GGC TGT GCC AAG TTC TGC CGG CCC CGC GAC GAT 747 
Asn Tyr Tyr Gly Ser Gly Cys Ala Lys Phe Cys Arg Pro Arg Asp Asp 
190 195 200 

TCA TTT GGA CAC TCG ACT TGC TCG GAG ACG GGC GAA ATT ATC TGT TTG 795 
Ser Phe Gly His Ser Thr Cys Ser Glu Thr Gly Glu Ile Ile Cys Leu 
205 210 215 

ACC GGA TGG CAG GGC GAT TAC TGT CAC ATA CCC AAA TGC GCC AAA GGC 843 
Thr Gly Trp Gln Gly Asp Tyr Cys His Ile Pro Lys Cys Ala Lys Gly 
220 225 230 

TGT GAA CAT GGA CAT TGC GAC AAA CCC AAT CAA TGC GTT TGC CAA CTG 891 
Cys Glu His Gly His Cys Asp Lys Pro Asn Gln Cys Val Cys Gln Leu 
235 240 245 250 

GGC TGG AAG GGA GCC TTG TGC AAC GAG TGC GTT CTG GAA CCG AAC TGC 939 
Gly Trp Lys Gly Ala Leu Cys Asn Glu Cys Val Leu Glu Pro Asn Cys 
255 260 265 

ATC CAT GGC ACC TGC AAC AAA CCC TGG ACT TGC ATC TGC AAC GAG GGT 987 
Ile His Gly Thr Cys Asn Lys Pro Trp Thr Cys Ile Cys Asn Glu Gly 
270 275 280 

TGG GGA GGC TTG TAC TGC AAC CAG GAT CTG AAC TAC TGC ACC AAC CAC 1035 
Trp Gly Gly Leu Tyr Cys Asn Gln Asp Leu Asn Tyr Cys Thr Asn His 
285 290 295 

AGA CCC TGC AAG AAT GGC GGA ACC TGC TTC AAC ACC GGC GAG GGA TTG 1083 
Arg Pro Cys Lys Asn Gly Gly Thr Cys Phe Asn Thr Gly Glu Gly Leu 
300 305 310 

TAC ACA TGC AAA TGC GCT CCA GGA TAC AGT GGT GAT GAT TGC GAA AAT 1131 
Tyr Thr Cys Lys Cys Ala Pro Gly Tyr Ser Gly Asp Asp Cys Glu Asn 
315 320 325 330 

GAG ATC TAC TCC TGC GAT GCC GAT GTC AAT CCC TGC CAG AAT GGT GGT 1179 
Glu Ile Tyr Ser Cys Asp Ala Asp Val Asn Pro Cys Gln Asn Gly Gly 
335 340 345 

ACC TGC ATC GAT GAG CCG CAC ACA AAA ACC GGC TAC AAG TGT CAT TGC 1227 
Thr Cys Ile Asp Glu Pro His Thr Lys Thr Gly Tyr Lys Cys His Cys 
350 355 360 

GCC AAC GGC TGG AGC GGA AAG ATG TGC GAG GAG AAA GTG CTC ACG TGT 1275 
Ala Asn Gly Trp Ser Gly Lys Met Cys Glu Glu Lys Val Leu Thr Cys 
365 370 375 

TCG GAC AAA CCC TGT CAT CAG GGA ATC TGC CGC AAC GTT CGT CCT GGC 1323 
Ser Asp Lys Pro Cys His Gln Gly Ile Cys Arg Asn Val Arg Pro Gly 
380 385 390 

TTG GGA AGC AAG GGT CAG GGC TAC CAG TGC GAA TGT CCC ATT GGC TAC 1371 
Leu Gly Ser Lys Gly Gln Gly Tyr Gln Cys Glu Cys Pro Ile Gly Tyr 
395 400 405 410 

Ser Gly Pro Asn Cys Asp 
540 
AGT CCG AAT CCA 1419 
Ser Pro Asn Pro 
425 

TGT ATT TGC CCA 1467 
Cys Ile Cys Pro 
440 

GAC GAT TGT CTT 1515 
Asp Asp Cys Leu 
455 

ATG GTC AAC CAA 1563 
Met Val Asn Gln 



CAC TGT AGT AGC 1611 
His Cys Ser Ser 
490 

GGA GGA ACC TGC 1659 
Gly Gly Thr Cys 
505 

GCG GGA TTT ACT 1707 
Ala Gly Phe Thr 
520 

AGT GGA CCC TGT 1755 
Ser Gly Pro Cys 
535 

TTC GAA TGC GTG 1803 
Phe Glu Cys Val 

TGT GCC AAT GGT TTC AGG GGC AAG CAG TGC GAT GAG GAG TCC TAC GAT 1851 
Cys Ala Asn Gly Phe Arg Gly Lys Gln Cys Asp Glu Glu Ser Tyr Asp 
555 560 565 570 

TCG GTG Ad TTC GAT GCC CAC CAA TAT GGA GCG ACC ACA CAA GCG AGA 1899 
Ser Val Thr Phe Asp Ala His Gln Tyr Gly Ala Thr Thr Gln Ala Arg 
575 580 585 

GCC GAT GGT TTG ACC AAT GCC CAG GTA GTC CTA ATT GCi-.GTi TTC TCC 1947 
Ala Asp Gly Leu Thr Asn Ala Gln Val Val Leu Ile Ala Val Phe Ser 
590 595 600 

GTT GCG ATG CCT TTG GTG GCG GTT ATT GCG GCG TGC GTG GTC TTC TGC 1995 
Val Ala Met Pro Leu Val Ala Val Ile Ala Ala Cys Val Val Phe Cys 
605 610 615 

ATG AAG CGC AAG CGT AAG CGT GCT CAG GAA AAG GAC GAC GCG GAG GCC 2043 
Met Lys Arg Lys Arg Lys Arg Ala Gln Glu Lys Asp Asp Ala Glu Ala 
620 625 630 

AGG AAG CAG AAC GAA CAG AAT GCG GTG GCC ACA ATG CAT CAC AAT GGC 2091 
Arg Lys Gln Asn Glu Gln Asn Ala Val Ala Thr Met His His Asn Gly 
635 640 645 650 

AGT GGG GTG GGT GTA GCT TTG GCT TCA GCC TCT CTG GGC GGC AAA ACT 2139 
Ser Gly Val Gly Val Ala Leu Ala Ser Ala Ser Leu Gly Gly Lys Thr 
655 660 665 

GGC AGC AAC AGC GGT CTC ACC TTC GAT GGC GGC AAC CCG AAT ATC ATC 2187 
Gly Ser Asn Ser Gly Leu Thr Phe Asp Gly Gly Asn Pro Asn Ile Ile 
670 675 680 

AAA AAC ACC TGG GAC AAG TCG GTC AAC AAC ATT TGT GCC TCA GCA GCA 2235 
Lys Asn Thr Trp Asp Lys Ser Val Asn Asn Ile Cys Ala Ser Ala Ala 
685 690 695 



FIG.1E 



SUBSTITUTE SHEET (RULE 26) 



WO 94/07474 



PCT/US93/09338 




GCA GCG GCG GCG GCG GCA GCA GCG GCG GAC GAG TGT CTC ATG TAC GGC 2283 
Ala Ala Ala Ala Ala Ala Ala Ala Ala Asp Glu Cys Leu Met Tyr Gly 
700 705 710 

GGA TAT GTG GCC TCG GTG GCG GAT AAC AAC AAT GCC AAC TCA GAC TTT 2331 
Gly Tyr Val Ala Ser Val Ala Asp Asn Asn Asn Ala Asn Ser Asp Phe 
715 720 725 730 

TGT GTG GCT CCG CTA CAA AGA GCC AAG TCG CAA AAG CAA CTC AAC ACC 2379 
Cys Val Ala Pro Leu Gln Arg Ala Lys Ser Gln Lys Gln Leu Asn Thr 
735 740 745 

GAT CCC ACG CTC ATG CAC CGC GGT TCC CCG GCA GGC AGC TCA GCC AAG 2427 
Asp Pro Thr Leu Met His Arg Gly Ser Pro Ala Gly Ser Ser Ala Lys 
750 755 760 

GGA GCG TCT GGC GGA GGA CCG GGA GCG GCG GAG GGC AAG AGG ATC TCT 2475 
Gly Ala Ser Gly Gly Gly Pro Gly Ala Ala Glu Gly Lys Arg Ile Ser 
765 770 775 

GTT TTA GGC GAG GGT TCC TAC TGT AGC CAG CGT TGG CCC TCG TTG GCG 2523 
Val Leu Gly Glu Gly Ser Tyr Cys Ser Gln Arg Trp Pro Ser Leu Ala 
780 785 790 

GCG GCG GGA GTG GCC GGA GCC TGT TCA TCC CAG CTA ATG GCT GCA GCT 2571 
Ala Ala Gly Val Ala Gly Ala Cys Ser Ser Gln Leu Met Ala Ala Ala 
795 800 805 810 

TCG GCA GCG GGC AGC GGA GCG GGG ACG GCG CAA CAG CAG CGA TCC GTG 2619 
Ser Ala Ala Gly Ser Gly Ala Gly Thr Ala Gln Gln Gln Arg Ser Val 
815 820 825 

GTC TGC GGC ACT CCG CAT ATG TAACTCCAAA AATCCGGAAG GGCTCCTGGT 2670 
Val Cys Gly Thr Pro His Met 
830 

AAATCCGGAG AAATCCGCAT GGAGGAGCTG ACAGCACATA CACAAAGAAA AGACTGGGTT 2730 

GGGTTCAAAA TGTGAGAGAG ACGCCAAAAT GTTGTTGTTG ATTGAAGCAG TTTAGTCGTC 2790 

ACGAAAAATG AAAAATCTGT AACAGGCATA ACTCGTAAAC TCCCTAAAAA ATTTGTATAG 2850 

TAATTAGCAA AGCTGTGACC CAGCCGTTTC GATCCCGAAT TC 2892 

1 GAATTCCGCT GGGAGAATGG TCTGAGCTAC CTGCCCGTCC IGCTGGGGCA TCAATGGCAA 

61 GTGGGGAAAG CCACACTGGG CAAACGGGCC AGGCCATTTC TGGAATGTGG TACATGGTGG 

121 GCAGGGGGCC CGCAACAGCT GGAGGGCAGG TGGACTGAGG CTGGGGATCC CCCGCTGGTT 

181 GGGCAATACT GCCTTTACCC ATGAGCTGGA AAGICACAAT GGGGGGCAAG GGCTCCCGAG 

241 GGTGGTTATG TGCTTCCTTC AGGTGGC 



FIG.8A 



CATTATACGT GACTTTTCTG AAACTGTAGC CACCCTAGTG TCTCTAACTC 

TTGTCAGCTT TGGTCTTTTC AAAGAGCAGG CTCTCTTCAA GCTCCTTAAT 

TCCAGTTTGG TCTGCGTCTC AAGATCACCT TTGGTAATTG ATTCTTCTTC 

TGAAGGCTGG CTCTCACCCT CTAGGCAGAG CAGGAATTCC GAGGTGGATG 

GAATGTCCGT GGCCCAGATG GCTGCACCCC ATTGATGTTG GCTTCTCTCC 

CTCAGATTTG AGTGATGAAG ATGAAGATGC AGAGGACTGT TCTGCTAACA 

CTTGGTCTAC CAGGGTGCCA GCCTCCAGAC CAGACAGACC GGACTGGTGA 

CACCTTGCAG CCCGCTACTC ACGGGCTGAT GCTGCCAAGC GTCTCCTGGA 

GATGCCAATG CCCAGGACAA CATGGGCCGC TGTCCACTCC ATGCTGCAGT 
GCCAAGGTGT ATTCAGATCT GTTA 



FIG.8B 



GATTCGCAAC CGAGTAACTG ATCTAGATGC CAGGATGAAT GATGGTACTA 
CCTGGCTGCC CGCCTGGCTG TGGAGGGAAT GGTGGCAGAA CTGATCAACT 
TGTGAATGCA GTGGATGACC ATGGAAAATC TGCTCTTCAC TGGGCAGCTG 
TGTGGAGGCA ACTCTTTTGT TGTTGAAAAA TGGGGCCAAC CGAGACATGC 
GGAAGAGACA CCTCTGTTTC TTGCTGCCCG GGAGGAGCTA TAAGC 



FIG.8C 

1 GAATTCCTTC 

61 CCTCTGGAGT 

121 GCGGGCATGC 

181 AACCCGGAAC 

241 TGTTAGATGT 

301 GAGGAGGCAG 

361 TCATCACAGA 

421 GATGGCCCTG 

481 TGCAGGTGCA 

541 GGCACGTGAT 



1 TCCAGATTCT 

61 CACCCCTGAT 

121 GCCAAGCGGA 

181 CTGTCAATAA 

241 AGGACAACAA 

1 GAATTCCATT CAGGAGGAAA GGGTGGGGAG AGAAGCAGGC ACCCACTTTC CCGTGGGTGG 
61 ACTCGTTCCC AGGTGGCTCC ACCGGCAGCT GTGACCGCCG CAGGTGGGGG CGGAGTGCCA 
121 TTCAGAAAAT TCCAGAAAAG CCCTACCCCA ACTCGGACGG CAACGTCACA CCCGTGGGTA 
181 GCAACTGGCA CACAAACAGC CAGCGTGTCT GGGGCACGGG GGGATGGCAC CCCCTGCAGG 
241 CAGAGCTG 

FIG.9A 



1 CTAAAGGGAA CAAAAGCNGG AGCTCCACCG CGGGCGGCNC NGCTCTAGAA CTAGTGGANN 

61 NCCCGGGCTG CAGGAATTCC GGCGGACTGG GCTCGGGCTC AGAGCGGCGC TGTGGAAGAG 

121 ATTCTAGACC GGGAGAACAA GCGAATGGCT GACAGCTGGC CTCCAAAGTC ACCAGGCTCA 

181 AATCGCTCGC CCTGGACATC GAGGGATGCA GAGGATCAGA ACCGGTACCT GGATGGCATG 

241 ACTCGGATTT ACAAGCATGA CCAGCCTGCT TACAGGGAGC GTGANNTTTT CACATGCAGT 

301 CGACAGACAC GAGCTCTATG CAT 



FIG.9B 

G GAG GTG GAT GTG TTA GAT GTG AAT GTC CGT GGC CCA GAT GGC TGC 46 
Glu Val Asp Val Leu Asp Val Asn Val Arg Gly Pro Asp Gly Cys 
1 5 10 15 

ACC CCA TTG ATG TTG GCT TCT CTC CGA GGA GGC AGC TCA GAT TTG AGT 94 
Thr Pro Leu Met Leu Ala Ser Leu Arg Gly Gly Ser Ser Asp Leu Ser 
20 25 30 

GAT GAA GAT GAA GAT GCA GAG GAC TCT TCT GCT AAC ATC ATC ACA GAC 142 
Asp Glu Asp Glu Asp Ala Glu Asp Ser Ser Ala Asn Ile Ile Thr Asp 
35 40 45 

TTG GTC TAC CAG GGT GCC AGC CTC CAG GCC CAG ACA GAC CGG ACT GGT 190 
Leu Val Tyr Gln Gly Ala Ser Leu Gln Ala Gln Thr Asp Arg Thr Gly 
50 55 60 

GAG ATG GCC CTG CAC CTT GCA GCC CGC TAC TCA CGG GCT GAT GCT GCC 238 
Glu Met Ala Leu His Leu Ala Ala Arg Tyr Ser Arg Ala Asp Ala Ala 
65 70 75 

AAG CGT CTC CTG GAT GCA GGT GCA GAT GCC AAT GCC CAG GAC AAC ATG 286 
Lys Arg Leu Leu Asp Ala Gly Ala Asp Ala Asn Ala Gln Asp Asn Met 
80 85 90 95 

GGC CGC TGT CCA CTC CAT GCT GCA GTG GCA GCT GAT GCC CAA GGT GTC 334 
Gly Arg Cys Pro Leu His Ala Ala Val Ala Ala Asp Ala Gln Gly Val 
100 105 110 

TTC CAG ATT CTG ATT CGC AAC CGA GTA ACT GAT CTA GAT GCC AGG ATG 382 
Phe Gln Ile Leu Ile Arg Asn Arg Val Thr Asp Leu Asp Ala Arg Met 
115 120 125 

AAT GAT GGT ACT ACA CCC CTG ATC CTG GCT GCC CGC CTG GCT GTG GAG 430 
Asn Asp Gly Thr Thr Pro Leu Ile Leu Ala Ala Arg Leu Ala Val Glu 
130 135 140 

GGA ATG GTG GCA GAA CTG ATC AAC TGC CAA GCG GAT GTG AAT GCA GTG 
Gly Met Val Ala Glu Leu Ile Asn Cys Gln Ala Asp Val Asn Ala Val 
145 150 155 

GAT GAC CAT GGA AAA TCT GCT CTT CAC TGG GCA GCT GCT GTC AAT AAT 
Asp Asp His Gly Lys Ser Ala Leu His Trp Ala Ala Ala Val Asn Asn 
160 165 170 175 

GTG GAG GCA ACT CTT TTG TTG TTG AAA AAT GGG GCC AAC CGA GAC ATG 
Val Glu Ala Thr Leu Leu Leu Leu Lys Asn Gly Ala Asn Arg Asp Met 
180 185 190 

CAG GAC AAC AAG GAA GAG ACA CCT CTG TTT CTT GCT GCC CGG GAG GGG 
Gln Asp Asn Lys Glu Glu Thr Pro Leu Phe Leu Ala Ala Arg Glu Gly 
195 200 205 

AGC TAT GAA GCA GCC AAG ATC CTG TTA GAC CAT TTT GCC AAT CGA GAC 
Ser Tyr Glu Ala Ala Lys Ile Leu Leu Asp His Phe Ala Asn Arg Asp 
210 215 220 

ATC ACA GAC CAT ATG GAT CGT CTT CCC CGG GAT GTG GCT CGG GAT CGC 
Ile Thr Asp His Met Asp Arg Leu Pro Arg Asp Val Ala Arg Asp Arg 
225 230 235 

ATG CAC CAT GAC ATT GTG CGC CTT CTG GAT GAA TAC AAT GTG ACC CCA 
Met His His Asp Ile Val Arg Leu Leu Asp Glu Tyr Asn Val Thr Pro 
240 245 250 255 

AGC CCT CCA GGC ACC GTG TTG ACT TCT GCT CTC TCA CCT GTC ATC TGT 
Ser Pro Pro Gly Thr Val Leu Thr Ser Ala Leu Ser Pro Val Ile Cys 
260 265 270 

GGG CCC AAC AGA TCT TTC CTC AGC CTG AAG CAC ACC CCA ATG GGC AAG 
Gly Pro Asn Arg Ser Phe Leu Ser Leu Lys His Thr Pro Met Gly Lys 
275 280 285 

AAG TCT AGA CGG CCC AGT GCC AAG AGT ACC ATG CCT ACT AGC CTC CCT 910 
Lys Ser Arg Arg Pro Ser Ala Lys Ser Thr Met Pro Thr Ser Leu Pro 
290 295 300 

AAC CTT GCC AAG GAG GCA AAG GAT GCC AAG GGT AGT AGG AGG AAG AAG 958 
Asn Leu Ala Lys Glu Ala Lys Asp Ala Lys Gly Ser Arg Arg Lys Lys 
305 310 315 

TCT CTG AGT GAG AAG GTC CAA CTG TCT GAG AGT TCA GTA ACT TTA TCC 1006 
Ser Leu Ser Glu Lys Val Gln Leu Ser Glu Ser Ser Val Thr Leu Ser 
320 325 330 335 

CCT GTT GAT TCC CTA GAA TCT CCT CAC ACG TAT GTT TCC GAC ACC ACA 1054 
Pro Val Asp Ser Leu Glu Ser Pro His Thr Tyr Val Ser Asp Thr Thr 
340 345 350 

TCC TCT CCA ATG ATT ACA TCC CCT GGG ATC TTA CAG GCC TCA CCC AAC 1102 
Ser Ser Pro Met Ile Thr Ser Pro Gly Ile Leu Gln Ala Ser Pro Asn 
355 360 365 

CCT ATG TTG GCC ACT GCC GCC CCT CCT GCC CCA GTC CAT GCC CAG CAT 1150 
Pro Met Leu Ala Thr Ala Ala Pro Pro Ala Pro Val His Ala Gln His 
370 375 380 

GCA CTA TCT TTT TCT AAC CTT CAT GAA ATG CAG CCT TTG GCA CAT GGG 1198 
Ala Leu Ser Phe Ser Asn Leu His Glu Met Gln Pro Leu Ala His Gly 
385 390 395 

GCC AGC ACT GTG CTT CCC TCA CJG AGC CAG TTG CTA TCC CAC CAC CAC 1246 
Ala Ser Thr Val Leu Pro Ser Val Ser Gln Leu Leu Ser His His His 
400 405 410 415 

ATT GTG TCT CCA GGC AGT GGC AGT GCT GGA AGC TTG AGT AGG CTC CAT 1294 
Ile Val Ser Pro Gly Ser Gly Ser Ala Gly Ser Leu Ser Arg Leu His 
420 425 430 

CCA GTC CCA GTC CCA GCA GAT TGG ATG AAC CGC ATG GAG GTG AAT GAG 1342 
Pro Val Pro Val Pro Ala Asp Trp Met Asn Arg Met Glu Val Asn Glu 
435 440 445 

ACC CAG TAC AAT GAG ATG TTT GGT ATG GTC CTG GCT CCA GCT GAG GGC 1390 
Thr Gln Tyr Asn Glu Met Phe Gly Met Val Leu Ala Pro Ala Glu Gly 
450 455 460 

ACC CAT CCT GGC ATA GCT CCC CAG AGC AGG CCA CCT GAA GGG AAG CAC 1438 
Thr His Pro Gly Ile Ala Pro Gln Ser Arg Pro Pro Glu Gly Lys His 
465 470 475 

ATA ACC ACC CCT CGG GAG CCC TTG CCC CCC ATT GTG ACT TTC CAG CTC 1486 
Ile Thr Thr Pro Arg Glu Pro Leu Pro Pro Ile Val Thr Phe Gln Leu 
480 485 490 495 

ATC CCT AAA GGC AGT ATT GCC CAA CCA GCG GGG GCT CCC CAG CCT CAG 1534 
Ile Pro Lys Gly Ser Ile Ala Gln Pro Ala Gly Ala Pro Gln Pro Gln 
500 505 510 

TCC ACC TGC CCT CCA GCT GTT GCG GGC CCC CTG CCC ACC ATG TAC CAG 1582 
Ser Thr Cys Pro Pro Ala Val Ala Gly Pro Leu Pro Thr Met Tyr Gln 
515 520 525 

ATT CCA GAA ATG GCC CGT TTG CCC AGT GTG GCT TTC CCC ACT GCC ATG 1630 
Ile Pro Glu Met Ala Arg Leu Pro Ser Val Ala Phe Pro Thr Ala Met 
530 535 540 

ATG CCC CAG CAG GAC GGG CAG GTA GCT CAG ACC ATT CTC CCA GCC TAT 1678 
Met Pro Gln Gln Asp Gly Gln Val Ala Gln Thr Ile Leu Pro Ala Tyr 
545 550 555 

CAT CCT TTC CCA GCC TCT GTG GGC AAG TAC CCC ACA CCC CCT TCA CAG 1726 
His Pro Phe Pro Ala Ser Val Gly Lys Tyr Pro Thr Pro Pro Ser Gln 
560 565 570 575 

CAC AGT TAT GCT TCC TCA AAT GCT GCT GAG CGA ACA CCC AGT CAC AGT 1774 
His Ser Tyr Ala Ser Ser Asn Ala Ala Glu Arg Thr Pro Ser His Ser 
580 585 590 

GGT CAC CTC CAG GGT GAG CAT CCC TAC CTG ACA CCA TCC CCA GAG TCT 1822 
Gly His Leu Gln Gly Glu His Pro Tyr Leu Thr Pro Ser Pro Glu Ser 
595 600 605 

CCT GAC CAC TGG TCA AGT TCA TCA CCC CAC TCT GCT TCT GAC TGG TCA 1870 
Pro Asp Gln Trp Ser Ser Ser Ser Pro His Ser Ala Ser Asp Trp Ser 
610 615 620 

GAT GTG ACC ACC AGC CCT ACC CCT GGG GGT GCT GGA GGA GGT CAG CGG 1918 
Asp Val Thr Thr Ser Pro Thr Pro Gly Gly Ala Gly Gly Gly Gln Arg 
625 630 635 

GGA CCT GGG ACA CAC ATG TCT GAG CCA CCA CAC AAC AAC ATG CAG GTT 1966 
Gly Pro Gly Thr His Met Ser Glu Pro Pro His Asn Asn Met Gln Val 
640 645 650 655 

TAT GCG TGAGAGAGTC CACCTCCAGT GTAGAGACAT AACTGACTTT TGTAAATGCT 2022 
Tyr Ala 

GCTGAGGAAC AAATGAAGGT CATCCGGGAG AGAAATGAAG AAATCTCTGG AGCCAGCTTC 2082 

TAGAGGTAGG AAAGAGAAGA TGTTCTTATT CAGATAATGC AAGAGAAGCA ATTCGTCAGT 2142 

TTCACTGGGT ATCTGCAAGG CTTATTGATT ATTCTAATCT AATAAGACAA GTTTGTGGAA 2202 

ATGCAAGATG AATACAAGCC TTGGGTCCAT GTTTACTCTC TTCTATTTGG AGAATAAGAT 2262 

GGATGCTTAT TGAAGCCCAG ACATTCTTGC AGCTTGGACT GCATTTTAAG CCCTGCAGGC 2322 

TTCTGCCATA TCCATGAGAA GATTCTACAC TAGCGTCCTG TTGGGAATTA TGCCCTGGAA 2382 

TTCTGCCTGA ATTGACCTAC GCATCTCCTC CTCCTTGGAC ATTCTTTTGT CTTCATTTGG 2442 

TGCTTTTGGT TTTGCACCTC TCCGTGATTG TAGCCCTACC AGCATGTTAT AGGGCAAGAC 2502 

CTTTGTGCTT TTGATCATTC TGGCCCATGA AAGCAACTTT GGTCTCCTTT CCCCTCCTGT 2562 

CTTCCCGGTA TCCCTTGGAG TCTCACAAGG TTTACTTTGG TATGGTTCTC AGCACAAACC 2622 

TTTCAAGTAT GTTGTTTCTT TGGAAAATGG ACATACTGTA TTGTGTTCTC CTGCATATAT 2682 

CATTCCTGGA GAGAGAAGGG GAGAAGAATA CTTTTCTTCA ACAAATTTTG GGGGCAGGAG 2742 

ATCCCTTCAA GAGGCTGCAC CTTAATTTTT CTTGTCTGTG TGCAGGTCTT CATATAAACT 2802 

TTACCAGGAA GAAGGGTGTG AGTTTGTTGT TTTTCTGTGT ATGGGCCTGG TCAGTGTAAA 2862 

GTTTTATCCT TGATAGTCTA GTTACTATGA CCCTCCCCAC TTTTTTAAAA CCAGAAAAAG 2922 

GTTTGGAATG TTGGAATGAC CAAGAGACAA GTTAACTCGT GCAAGAGCCA GTTACCCACC 2982 

CACAGGTCCC CCTACTTCCT GCCAAGCATT CCATTGACTG CCTGTATGGA ACACATTTGT 3042 

CCCAGATCTG AGCATTCTAG GCCTGTTTCA CTCACTCACC CAGCATATGA AACTAGTCTT 3102 

AACTGTTGAG CCTTTCCTTT CATATCCACA GAAGACACTG TCTCAAATGT TGTACCCTTG 3162 

CCATTTAGGA CTGAACTTTC CTTAGCCCAA GGGACCCAGT GACAGTTGTC TTCCGTTTGT 3222 

CAGATGATCA GTCTCTACTG ATTATCTTGC TGCTTAAAGG CCTGCTCACC AATCTTTCTT 3282 

TCACACCGTG TGGTCCGTGT TACTGGTATA CCCAGTATGT TCTCACTGAA GACATGGACT 3342 

TTATATGTTC AAGTGCAGGA ATTGGAAAGT TGGACTTGTT TTCTATGATC CAAAACAGCC 3402 

CTATAAGAAG GTTGGAAAAG GAGGAACTAT ATAGCAGCCT TTGCTATTTT CTGCTACCAT 3462 

TTCTTTTCCT CTGAAGCGGC CATGACATTC CCTTTGGCAA CTAACGTAGA AACFCAACAG 3522 

AACATTTTCC TTTCCTAGAG TCACCTTTTA GATGATAATG GACAACTATA GACTTGCTCA 3582 

TTGTTCAGAC TGATTGCCCC TCACCTGAAT CCACTCTCTG TATTCATGCT CTTGGCAAT7 3642 

TCTTTGACTT TCT7TTAAGG GCAGAAGCAT TTTAGTTAAT TGTAGATAAA GAATAGTTTT 3702 


CTTCCTCnC TCCTTGGGCC AGTTAATAAT TGGTCCATGG CTACACiGCA ACTTCCGTCC 3762 

AGTGCTGTGA TGCCCATGAC ACCTGCAAAA TAAGTICTGC CTGGGCATTT TGTAGATATT 3822 

AACAGGTGAA TTCCCGACTC TTTTGGTfTG AATGACAGTT C7CATTCCTT CTATGGCTGC 3882 

AAGTATGCAT CAGTGCTTCC CACTTACCTG ATTTGTGTGT CGGTGGCCCC ATATGGAAAC 3942 

CCTGCGTGTC TGTTGGCATA ATAGTTTACA AATGGTTTTT TCAGTCCTAT CCAAATTTAT 4002 

TGAACCAACA AAAATAATTA CTTCTGCCCT GAGATAAGCA GATTAAGTTT GTTCATTCTC 4062 

TGCTTTATTC TCTCCATGTG GCAACATTCT GTCAGCCTCT TTCATAGTGT GCAAACATTT 4122 

TATCATTCTA AATGGTGACT CTCTGCCCTT GGACCCATTT ATTATTCACA GATGGGGAGA 4182 

ACCTATCTGC ATGGACCCTC ACCATCCTCT GTGCAGCACA CACAGTGCAG GGAGCCAGTG 4242 

GCGATGGCGA TGACTTTCTT CCCCTG 4268 

Potential signal cleavage site 

MP ALRPAL LWALLALWLC CA APA HA T 

MP PL LAPLLCLALL PA LAA RG P 

MO R1GLAVLLCS LP VLT QG L 

MQSQRSRRRS RAPNTWICFW 1NKMHAVASL PASLPLLLLT LAP ANLPN I V RGTDTALVAA_ 

humN ! MLGKATCRCA SGfTGEOCQY STSHPCFVSR PCLNGGTCHM LSROT-YECT CQVGFTGKEC 

Tan-1 ; GVAOYACSCA LGFSGPLCLT PLDNAC-LTN PCRNGGTCOL LT-LTEYKCR CPPGWSGKSC 

Xen N I NAIDFICHCP VGFTDKVCLT PVDNAC-VNN PCRNGGTCEL LNSVTEYKCR CPPGWTGDSC 

Dros N i GRPGISCKCP LGFDESLCEI AVPNAC-DHV TCLNGGTCQL KT-LEEYTCA CANGYTGERC 



hum N i NLPGSYQCQC PQGFTGQYCD SLYVPCAPSP CVNGGTCRQT GDFTFECNCL PGFEGSTCER 

TAN-1 : NEVGSYRCVC RATHTGPNCE RPYVPCSPSP CQNGGTCRPT GDVTHECACL PGFTGQNCEE 

Xen N • NEFGSYRCTC QNRFTGRNCD EPYVPCNPSP CLNGGTCRQT DDTSYDCTCL PGFSGQNCEE 

Dros N = NTHGSYQCMC PTGYTGKOCD TKYNPCSPSP CQNAG1CRSN G-LSYECKCP KGFEGKNCEQ 



EGF-like Repeats 

'. . QCROGYEPCV NEGMCVTYHN GTGYCKCPEG FLGEYCQHRO PCE-KNRCQN GGTC-VAQA 83 

RCSCPGETCL NGGKCEA-AN GTEACVCGGA FVGPRCQOPN PCL-STPCKN AGTCHWDRR 80 

■ RCTQTAEMCL NGGRCEMTPG GTGVCLCGNL YFGERCQFPN PCT1KNQCMN FGTCEPVLQG 90 

I SCTSVG-CQ NGGTCVTQLN GKTYCACOSH YVGDYCEHRN PCN-SMRCQN GGTCQVTFRN 1 1 7 



QWTDACLSHP CANGSTCTTV -ANQF^KC LTGFTGQKCE TDVNEC-OIP GHCQHGGTCL 199 

QQADPCASNP CANGGQCLPF — EASYICHC PPSFHGPTCR QOVNECGQKP RLCRHGGTCH 196 

QQADPCASNP CANGGKCLPF -EIQYICKC PPGFHGATCK QDINEC-S-0 NPCKNGGX1 195 

ETKNLCASSP CRNGATCTAL AGSSSFTCSC PPCFTGDTCS YDIEEC-Q-S NPCKYGGICV 233 



NIDDCPNHRC QNGGVCVOGV NTYNCRCPPQ WTGQFCTECV OECLLQPNA- CQNGGTCANR 318 

NIDDCPGNNC KNXACVOGV NTYNCPCPPE WTGQYCTEDV DECQLMPNA- CQNGGTCHNT 315 

NIOOCPSNNC RNGGTCVOGV NTYNCQCPPD WTGQYCTEDV DECQLMPNA- CQNGGTCHNT 314 

NYDDCLGHLC CtCGTCIDGI SDYTCRCPPN FTGRFCQDDV DECAQRDHPV CQNGATCTNT 352 



hum N 
TAN-1 
Xen N 
Dros N 

hum N i NGGYGCVCVN GWSGDDCSEN IDOCAFASCT PGSTCIDRVA SFSCMCPEGK AGLLCHLDDA 

TAN-1 i HGGYNCVCVN GWTGEDCSEN IDDCASAACF HGATCHDRVA SFYCECPHGR TGLLCHLNDA 

Xen N YGGYNCVCVN GWTGEOCSEN IDOCANAACH SGATCHDRVA SFYCECPHGR TGLLCHLDNA 

DrosN 1HGSYSCICVN GWACLDCSNN TDDCKQAACF YGATC1DGVG SFYCQCTKGK TGLLCHLDDA 

h uffl N • AFHCECLKGY AGPRCEMOIN ECHSDPCQND ATCLDKIGGF TCLCMPGFKG VHCELEINEC 

TAN-1 ISFECQCLQGY TGPRCEIDVN ECVSNPCQND ATCLDQIGEF QCMCMPGYEG VHCEVNTDEC 

Xen N : SFQCNCPQGY AGPRCEIDVN ECLSNPCQNO STCLDQIGEF QC1CMPGYEG LYCETNIOEC 

Dros N : SYRCNCSQGF TGPRCETNIN ECESHPCQNE GSCLDOPGTF RCVOffGFTG TQCE1DIDEC 



hum N ATGFTGVLCE ENiONCDPOP CHHGQCQOGI OSYTCICNPG YMGAieSOQI OECYSSPCLN 

TAN-1 TEGYTGTHCE VOIDECOPOP CHYGSCKDGV ATFTCLCRPG YTGHHCETN1 NECSSQPCRL 

Xen N TEGFTGRHCE QOINECIPOP CHYGTCKOGI ATFTCLCRPG YTGRLCONOI NECLSKPCLN 

Dros N PPGYTGTSCE ININDCDSNP CHRGKCIDDV NSFKCLCDPG YTGYiCQKQl NECESNPCQF 



CISNPCHKGA LCOTNPLNGO YICTCPQGYK GADCTEDVDE CAMANSNPCE HAGKCVNlDG ! 438 

CISNPCNEGS NCDTNPVNGK A1CTCPSGYT GPACSQOVOE CSLG-ANPCE HAGKCINTLG ! 434 

CISNPCNEGS NCDTNPVNGK A1CTCPPGYT GPACNNDVOE CSLG-ANPCE HGGRCTNTLG > 433 

CTSNPCHADA ICDTSPINGS YACSCATGYK GVDCSEDIDE CDQG-SPCE HNG1CVNTPG : 470 



QSNPCVNNGQ CVDKVNRFQC LCPPGFTGPV CQIDIDOCSS TPCLNGAXCI DHPNGYECQC 558 

ASSPCLHNGR CLDKINEFQC ECPTGFTGHL CQYDVDECAS TPCKNGAKCL DGPNTYTCVC 554 

ASNPCLHNGK CIDKINEFRC DCPTGFSGNL CQHDFDECTS TPCKNGAKCL DGPNSYTCQC 553 

QSNPCLNDGT CHDK1NGFKC SCALGFTGAR CQ1N1DDCQS QPCRNRGICH DS1AGYSCEC 590 

DGRCIOLVNG YQCNCQPGTS GVNCEINFDD CASNPCIHG- 1CMDGINRYS CVCSPGFTGQ 677 

RGTCQDPONA YLCFCLKGTT GPNCE1NLD0 CASSPCDSG- TCLDK1DGYE CACEPGYTGS 673 

GGQCTORENG YICTCPKGTT GVNCETK100 CASNLCDNG- KCIOKIDGYE CTCEPGYTGK 672 

DGHCQORVGS YYCQCQAGTS GKNCEVNVNE CHSNPCNNGA TC10G1NSYK CQCVPGFTGQ 710 

hum N ! RCNIDIDECA SNPCRKGATC INGVNGFRCI CPEGPHHPSC YSQVNECLSN PCI-HGNCTG 

TAN-1 : MCNSNIDECA GNPCHNGGTC EDGINGFTCR CPEGYHDPTC LSEVNECNSN PCV-HGACRD 

Xen N : LCN1NINECD SNPCRNGGTC K0Q1NGFTCV CPOGYHDHMC LSEVNECNSN PC1-HGACH0 

Dros N HCEKNVDEC1 SSPC ANNCVC 1DQVNGYKCE CPRCFYDAHC LSDVDECASN PCVNEGRCED

hum N OECASNPCLN QGTCFDDISG YTCHCVLPYT GKNCQTVLAP CSPNPCENAA VCKESPNFES

TAN-1 ! NECASNPCLN KGTC I DOVAG YKCNCLLPYT GATCEWLAP CAPSPCRNGG ECRQSEOYES 

Xen N ■■ NECSSNPCLN HGTCIDOVAG YKCNCMLPYT GAICEAVLAP CAGSPCKNGG RCKESEDFET 

Dros N i DDCVTNPCGN CCTCIOKVNG YKCVCKVPFT GRDCESKMDP CASNRCKNEA KCTPSSNFLD 

hum N CLANPCQNGG SOCGVNTFS CLCLPGFTGD KCQTDNMECL SEPCKNGGTC SOYVNSYTCK

TAN-1 CRPNPCHNGG SC T DG INTAF COCLPGFRGT FCEE01NECA SOPCRNGANC TDCVOSYTCT 

Xen N . CQPNPCKNGG SCSDGIHTF CNCPAGFRGP KCEEDINECA SNPCKNGANC TDCVNSYTCT 

Dros N ' CASFPCQNGG TCLDG IGDYS CLCVDGFDGK HCETDINECL SQPCQNGATC SQYVNSYTCT 






GLSGYKCLCD AGWVGINCEV DKN£CLSNPC QNGGTCONLV NGYRCTCKKG FKGYNCQVNi 796 

SLNGYKCDCD PGWSGTNCDl NNNECESNPC VNGGTCKOMT SG1VCTCREG FSGPNCQTNI 792 

GVNGYKCDCE AGWSGSNCDI NNNECESNPC MNGGTCKDMT GAYICTCKAG FSGPNCQTNI 791 

GINEF1CHCP PGYTGKRCEL D1DECSSNPC QHGGTCYDKL NAFSCQOffG YTGQKCETN1 830 

YTCLCA-PGW QGQRCT1DID EC-ISKPCMN HGLCHNTQGS YMCECPPGFS GMDCEEDIDD 914 

FSCVCPTAGA KGQTCEVDIN EC-VLSPCRH GASCQNTHGG YRCHCQAGYS GRNCETDIDD 911 

FSCECP-PGW QGQTCE1DMN EC-VNRPCRN GATCQNTNGS YKCNCKPGYT GRNCEMDIDD 909 

FSCTCK-LCY TGRYCDED1D ECSLSSPCRN GASCLNVPGS YRCLCTKGYE GR0CA1NTDD 949 

CQAGFDGVHC ENNINECTES SCFNGGTCVO G1NSFSCLCP VGFTGSFCLH EINECSSHPC 1034 
CPAGFSGIHC ENNTPDCTES SCFNGGTCVO G1NSFTCLCP PGFTGSYCQH WNECDSRPC' 1031 

CQPGFSGIHC ESNTPDCTES SCFNGGTCID GINTFTCQCP PGFTGSYCQH DINECDSKPC 1029 

CPLGFSGINC QTNDEDCTES SCLNGGSCID GINGYNCSCL AGYSGANCQY KLNKCDSNPC 1069 
hum N LNEGTCVDGL GTYRCSCPLG YTGKNCQTLV NLCSRSPCKN KGTCVQKKAE SQCLCPSGWA

TAN-1 : LLGGTCQDGR GLHRCTCPQG YTGPNCQNLV HWCOSSPCKN GGKCWQTHTQ YRCECPSGWT 

Xen N : LNGGTCQDSY GTYKCTCPQG YTGLNCONLV RWCOSSPCKN GGKCWQTNNF YRCECKSGWT 

Dros N : LNGATCHEQN NEYTCHCPSG FTGKCCSEYV DWCCQSPCEN GATCSQMKHQ FSCKCSAGWT 

hum N iSNPCQHGATC SDFlGGYRCE CVPGYGGVNC EYEVDECQNQ PCQNGGTCID LVNHFKCSCP 

TAN-1 iPSPCQNGATC TDYLGGYSCK CVAGYHGVNC SEEIDECLSH PCQNGGTCLD LPNTYKCSCP 

Xen N IPNPCQNGATC TDYLGGYSCE CVAGYHGVNC SEEINECLSH PCQNGGTCID LINTYKCSCP 

Dros N iSQPCQNGGTC RDL1GAYECQ CRQGFCCQNC ELNIDDCAPN PCQNGGTCHD RVWFSCSCP 

humN ICLSNPCSSEG SLDC1QLTND YLCVCRSAFT GRHCETFVDV CPOPCLNGG TCAVASNWPD 

TAN-1 CLSNPCDARG TQNCVQRVND FHCECRAGHT GRRCESVING CKGKPCKNGG TCAVASNTAR 

Xen N CLSNPCDSRG TONCIQLVND YRCECRQGFT GRRCESWDG CKGMPCRNGG TCAVASNTER 

Dros N ;CLSNPCSNAG TLDCVQLVNN YHCNCRPGHM GRHCEHKVDF CAQSPCQNGG NCN1 — RQS 






C GAYCDVPNVS C01AASRRGV LVEHLCQHSG VC1NAGNTHY CQCPLGYTGS YCEEQLDECA 1154 

GLYCDVPSVS CEVAAQRQGV DVARLCQHGG LCVDAGNTHH CRCQAGYTGS YCEDLVOECS 1151 

GVYCDVPSVS CEVAAKQQGV D1VHLCRNSG MCVDTGNTHF CRCQAGYTGS YCEEQVDECS 1149 

GKLCDVQTIS CQDAADRKGL SLRQLC-NNG TCKOYGNSHV CYCSQGYAGS YCQKEIDECQ 1188 



PGTRGLLCEE NIODCAR GPHCLN GGCCK0R1GG YSCRCLPGFA GERCEGD1NE 1267 

RGTQGVHCEI NVODCNPPVD PVSRSPKCFN NGTCVDQVGG YSCTCPPGFV GERCEGDVNE 1271 

RGTQGVHCEI NVDDCTPFYD SFTLEPKCFN NGKCIDRVGG YNC1CPPGFV GERCEGDVNE 1269 

PGTMGIICEI NKDDCKP GACHN NGSCIDRVGG FECVCQPGFV GARCEGDINE 1300 



GFICRCPPGF SGARCQS — SCGQVKCRKG ECCVHTAS- GPRCFCPSP- — RDCES — 1376 

GFICKCPAGF EGATCENDAR TCGSLRCLNG GTCISGPR- SPTCLCLGPF TGPECQFPAS 1389 

GFICKCPPGF DGATCEYDSR TCSNLRCQNG GTCISVLT— SSKCVCSEGY TGATCQYPVI 1387 

GHHCICNNGF YGKNCELSGQ DCDSNPCRVG -NCWADEGF GYRCECPRGT LGEHCE1DTL 1415 

FIG. 13D
hum N -GC-ASSPCO HGGSCHPQRQ PPYYSCQCAP PFSGSRCEL- -YTAPP S TPP

TAN-1 SPCLGGNPCY NQGTCEPTSE SPFYRCLCPA KFNGLLCHIL DYSFGG GAGRD1PPP 

Xen N SPC-ASHPCY NGGTCQFFAE EPFFQCFCPK NFNGLFCHIL DYEFPG GLGKNITPP 

Dros N iPEC-SPNPCA QGAACEDLLG D-YECLCPS KWKGKRCDIi Y DANYPGWNGG SGSGNORYAA 

hum N NN-QCDELCN TVECLFDNFE CQGNSKTCK- -YDKYCADHF KONHCNQGCN SEECGWOGLD 

TAN-1 :SDGHCOSQCN SAGCLFDGFD CQRAEGQCNP LYDOYCKDHF SDGHCOQGCN SAECEWDGLD 

Xen N INDGKCDSQCN NTGCLYDGFD CQKVEVQCNP LYDOYCKDHF GDGHCDQGCN NAECFJDGLD 

Dros N 'KNGKCNEECN NAACHYDGHO CERKLKSCOS LFDAYCQKHY GDGFCDYGCN NAECSWDCLO 

hum N YYGEKSAAMK KQ— R MTRRSL PGEQ E QEVAGSKVFl 

TAN-1 YYGREEELRK HP1KRAAEGW AAPOALLGQV KASLLPGGSE GGRRRRELDP MDVRGSIVYL 

Xen N YYGNEEELKK HHIKRSTDYW SDAPSAI FSMESIL LGRHRRELDE ME VRGS I VYL 

Dros N WKDNVRVPEI EDTDFARKNK ILYTOOVHQ- TGIQIYL 




LNR (Notch/Lin-12 Repeats)






FIG.13E 



^_A_TCL SQYCADKARD GVCOEACNSH ACQWDGGDCS LTMENPWANC SSPLPCWDYI 1476 

LIEE — ACE LPECQEDAGN KVCSLQCNNH ACGWDGGDCS LNFNOPWKNC TQSLGCWKYF 1501 

DNDD — ICE NEQCSELADN KVCNANCNNH ACGWDGGDCS LNFNOPWKNC TQSLQCWKYF 1498 

DLEQQRA MCO KRGCTEKQGN GICOSDCNTY ACNFDGNDCS LG1-NPWANC TAN-EXWNKF 1531 

CAADQPEN-L AEGTLVIWL MPPEQLLQDA R-SFLRALGT LLHTNLRIKR DSQGELMVYP 1591 

CAEHVPER-L AAGTL-WW LMPPEQLRNS SFHFLRELSR VLHTNWFKR DAHGQQMIFP 1619 

-< C-WEN-L AEGTLVLWL IWPPERLKNNS V-NFLRELSR VlHTNWFKK DSKGEYKIYP 1615 

jBKTQSPVL AEGAMSWML WVEAFREIQ A-QFLRNMSH MLRTTVRLKK OALGHO 1 1 1 N 1650 

JM 

EIDNRQCVQD SDHCFKNTDA AAALLASHAI QG — TLSYP LVSWSESLT PERT-Q-LLY 1680 

EIDNRQCVQA SSQCFQSATD VAAFLGALAS LGSL-NIPYK IEAVQSETVE PPPPAQ-LHF 1737 

E1DNRQCYKS SSQCFNSATD VAAFLGALAS LGSLDTLSYK IEAVKSEN&E TPKPST-LYP 1730 

EIDNRKCTEC FTHAVEAAEF LAATAAKHQL RNDFQ-IHSV RG1KNPGDED NGEPPANVKY 1745 
hum N LLAVAWIIL FIJLLGVIMA KRKRK-HGS LWLPEGFTLR RDASNHKRRE PVGQOAVGLK

TAN-1 •MYVAAAAFVL LFFVGCGVLL SRKRRRQHGQ LWFPEGFKV- SEASKKKRRE ELGEDSVGLK 

Xen N iMLSMLVIPLL IIFVFMWIV NKKRRREHDS FGSPTALFQK NPA-KRNGET PW-EOSVGLK 

Dros N IV1TG11LVI1 ALAFFGMVL1 - STQRKRAHGV TWFPEGFRAP AAVMSRRRRO PHGQEMRNLN 

CDC-10/Ankyrin Repeats

hum N PIORRPWTQQ HLEAAOIRRT PSLALTPPQA EQEVDVLDVN VRGPOGCTPL MLASLRGGSS 

TAN-1 QTDHRQWTQQ HLDAADL-RM SAMAPTPPQG EVDADCMOVN VRGPDGFTPL MIASCSGGGL > 

Xen N KTDPRQWTRQ HLDAADL-RI SSMAPTPPQG EIEADCMDVN VRGPDGFTPL MIASCSGGGL 

Dros N EADQRVWSQA HLDWDV-R- AIM-TPP-A HQOGGKHDVD ARGPCGLTPL MIAAVRGGGL



hum N ANAQOWGRC PLHAAVAADA QGVFQIL IRN RVTDLDARMN DGTTPL I LAA RLAVEGMVAE 

TAN-1 ANIC3DNM3RT PLHAAVSADA QGVFQIL IRN RATDLDARMH DGTTPL I LAA RLAVEGMLED 

Xen N :ANVQ0NM3RT PLHAAVAADA QGVFQIL IRN RATDLDARMF DGTTPL I LAA RLAVEGMVEE 

Dros N ! ANCQDNTGRT PLHAAVAADA MGVFQILLRN RATNLNARW DGTTPL I LAA RLAIEGMVEO 






NLSVQVSEAN LIGTGTSEHW VDDE G PQPKKVKAED EALLSE-EDD 1782 

PLK-NASDGA LMDDNQNE-W GDED LETKKFRFEE PWLPD-LDD 1837 

PIK-NMTDGS FMDDNQNE-W GDEET LENKRFRFEE QVILPELVDD 1831 

KQVAMQSQGV GQPGAH — W SDDESDMPLP KRQRSDPVSG VGLGNNGGYA SDHTMVSEYE 1861 



DLSDEDEDAE DSSANIITDL VYQGASLQAQ TDRTGEMALH LAARYSRADA AKRLLDAGAD 1902 

ETGNSEEE-E DAPA-VISDF IYQGASLHNQ TDRTGETALH LAARYSRSDA AKRLLEASAD 1954 

ETGNSEEE-E DASANMiSDF IGQGAQLHNQ TDRTGETALH LAARYARADA AKRLLESSAD 1949 

DTGEDIENNE DSTAQV1SDL LAQGAELNAT MDKTCETSLH LAARFARADA AKRLLDAGAD 1976 



LINCQADVNA VDDHGKSALH WAAAVNNVEA TLLLLKNGAN ROMQDNKEET PLFLAAREGS 2022 

LINSHADVNA VDDLGKSALH WAAAVNNVDA AWLLKNGAN KDMQNNREET PLFLAAREGS 2074 

LINAHADVNA VDEFGKSALH WAAAVNNVDA AAVLLKNSAN KDMQNNKEET SLFLAAREGS 2069 

LITADADINA ADNSGKTALH WAAAVNNTEA VN1LLMHHAN RDAQDDKDET PLFLAAREGS 2096 
hum N YEAAKILLDH FANRD I TDHW ORLPROVARD RMHHDIVRLL DEYNVTPSPP — GTVL — TS

TAN-1 YETAKVLLDH FANRDI TOHM DRLPROIAOE RMHHDIVRLL DEYNLVRSPO LHGAPLGGTP

Xen N ! YETAKVLLDH YANRD1TDHM DRLPROIAOE RMHHDIVHLL DEYNLVKSPT LHNGPLGAT- 

Dros N IYEACKALLDN FANR E1TDHM DRLPRDVASE RLHHDIVRLL DE-HVPRSPO MLSMTPQAMI 

NLS CK II cdc2 cdc2 



hum N GSRRKKSLSE KVQLSE-SS VTL5PVDSLE iSPHTYVSDTT iSSPM 

TAN-1 A-RRKKSQOG KGCLLD— SS GML5PVDSLE ISPHGYLSOVA ISPPLi 

Xen N A-ERKKSQOG KTTLLDSGSS GVLSPVQSLE iSTHGYLSDVS iSPR. 

Dros N GS-PDNGLDA TGSURRKASS KKTSAASKKA ANLNGLNPGO LTGGV6GVPG VPPTNSAAQA 
BNTS : 

hum N IT5PGIL0AS PNPML-ATA APPAPVHAQH 

TAN-1 LRSPF— QOS PSVPLNHLPG MPDTHLGIGH 

Xen N — MTSPF—flQS PSMRNHLTS MPESQLGMNH 

Dros N 'YEDCIKNAQS MQSLQGNGLO MIKLONYAYS MGSPf-Qa LLNGQGLGMN GNGQRNGVGP 

CK II cdc2 J 



ALSPV ICGP NRSFLSLKHT PMGKKSRRPS AKSTWTSLP NLAKEAKDAK 2127 

TLSPP LCSP NGYLGSLKPG VQGKKVRKPS SKGLACGS KEAKDLK 2178 

TLSPP |CSP NGYMGNMKPS VQSKKARKPS IKGNGC KEAKELK 2170 

GSPPPGQQOP aiTQPTVIS AGNGGNNGNG NASGKQSNQT AKQKAA KKAKLIE 2208 



2169 

2219 

2213 

AAAAAAAVAA MSHELEGSPV GVGMGGNLPS PYOTSSMYSN AMAAPLANGN PNTGAKQPPS: 2327 

ALSFSNLHEM Q -PLAHGASTV IPSVSGUSH HHIVSPGS- 2235 

LNVAA-KPEM AALGGGGRLA FETGPPRLSH LPVASGTSTV LGSSSGGALN FTVGGSTSLN 2306 

INMAT-KQEM AA— GSNRMA FDAMVPRLTH L-NASSPNTI MS — NGSMH FTVGGAPTMN 2294 

GVLPGGLCGM GGLSGAGNGN SHEQGLSPPY SNQSPPHSVQ SSLALSPHAY LGSPSPAKSR 2445 
hum N GSAGSLSRLH PVPVPADW- MNRKCVNETQ YNEMFGMVLA PAEG-THPGI APQSRPPEGK

TAN-1 GQCEWLSRLO SGMVPNQYNP LRGSVAPGPL 5TQAPSLQHG -MVGPLHSSL AASALSGMrfS 

Xen N SQCDWLARLO NGMVQNQYDP IRNGIQQGN- AQQAQALQHG LMTS-LHNGL PATTLSQWT 

Dros N PSLPTSPTHI QAMRHATQQK QFGGSNLNSL .LGGANGGGW GGGGGGGGGV GQGPONSPVS

hum N APQPOSTCPP AVAGPLPTMY QIP EM ARL-PSVAFP TAMTOXJQ VAQTILPAYH 

TAN-1 PPQPHLGVSS AASGHLGRSF LSGEPSQADV OPLGPSSLAV HTILPQ-ESP ALPTSLPSSL 

Xen N MQQQHHN-SS TTSTHINSPF CSSDISQTDL QQM— SSNNI HSVMPQ-DTQ IFAASLPSNL 

Dros N QQQLGGLEFG SAGLDLNG-F CGSPDSFHSG QMNPPS — I OSSMSG-SSP STNMLSPSSQ 

hum N : SOWSDVTTSP TRGGAGGGQR GPGTHMSEPPHNN MQVYA 

TAN-1 I SDWSEGVSSP PT SMQ SQIARIPEAFK 

Xen N ! SOWSEGISSP PT SMQ PQRTHIPEAFK 

Dros N : SDWSEGVQSP AANNLYISGG H0ANKGSEAIY1 






HITTPRE PLPP-IV-TF QLIPKGSIAQ PAG 2320 

■ YQGLPSTRL ATQPHLVQTQ QVQPQNLGMQ QQNLQPANIQ QQQSLQPPPP 2414 

— YQAMPNTRL ANQPHLMQAQ QMQQQQN LQLHQS 2384 

LGI ISPTGSD MGIMLAPPQS SKNSAIMQTI SPOQQQQQQO QQQQQHQQQQ QQQQQQQQQQ 2565 



PEST-containing Region



PFPASVGKYP!TPPSQHSYAS SNAAERTPSH SGHLQGEHPY LTPSPESPDQ WSSSSPHSA- 2433 

VPPVTAAQFL ITPPSQHSY-S S-PVENTPSH QLQVP-EGPF LTPSPESPDQ WSSSSPHSNV 2530 

TQSMTTAQFL ITPPSQHSY-S S-PMDNTPSH QLQVP-DHPF LTPSPESPDQ WSSSSPHSNV 2497 

HNQQAFYQYL ITPSSQHS GGHTPQH LVQTL-D-SY PTPSPESPGH WSSSSPRSN- 2671 



2471 
2556 
2523 
2703 
10 20 30 40 50 60 70 80 90


GGAATTCCCC CCGCCCTGCG CCCCGCTCTG CIGTGGGCGC TCCTGGCGCT CTGGCTGTGC TGCGCGGCCC CCGCGCATGC ATTGCAGTGT 
P A L R PAL I W A L L A L W L C C A A PAHA LOO 

100 110 120 130 140 150 160 170 180 


CGAGATGGCT ATGAACCCTG TGTAAATGAA GGAATGTGTG TTACCTACCA CAATGGCACA GGATACTGCA AATGTCCAGA AGGCTTCTTG 
R D G Y E P C V N E CMC V T Y H N G T G Y C K C P E G F L> 

190 200 210 220 230 240 250 260 270

GGGGAATATT GICAACATCG AGACCCCTGT GAGAAGAACC GCTGCCAGAA TGGTGGGACT TGTGTGGCCC AGGCCATGCT GGGGAAAGCC 
GEY COHR OPC EKN RCQN GGT CVA QAML GKA> 

280 290 300 310 320 330 340 350 360 


ACGTGCCGAT GTGCCTCAGG GTTTACAGGA GAGGACi'GCC AGTACICAAC ATCTCATCCA TGCTTTGTGT CTCCACCCTG CCTGAATGGC 
ICR C A S G F T G E D C Q Y S T S H P C F V S R P C L N G> 

370 380 390 400 410 420 430 440 450 

GGCACATGCC ATATGCTCAG CCGGGAIACC TATGAGTGCA CCTGTCAAGT CGGGTTTACA GGTAAGGAGT GCCAATGGAC GCATGCCTGC 
G T C H U L S R D T Y E C T C 0 V G F T G K E C O H T D A C> 

460 470 480 490 500 510 520 530 540 

CTGTCTCATC CCTCTGCAAA TGGAAGTACC TGTACCACTG TGGCCAACCA GTTCTCCTGC AAATGCCTCA CAGGCTTCAC AGGGCAGAM 
LSH PCAN GST CTT VANQ FSC KCL TGFT GOK> 

550 560 570 580 590 600 610 620 630 

TGTGAGACTG ATGTCAATGA GTGTGACATT CCAGGACACT GCCAGCATGG TGGCACCTGC CTCAACCTGC CTGGTTCCTA CCAGTGCCAG 
CEI DVNE COI PGH CQHG CTC LNL PGSY QC0> 

" 640 650 660 670 680 690 700 710 720 


TGCCCTCAGC GCTTCACAGG CCACTACTGT GACAGCCTGT ATCTGCCCTG TCCACCCTCA CCTTGTGTCA ATGGAGGCAC CTGTCGGCAG 
CPQ GFTG QYC OSL YVPC APS PCV N G G I CRO> 

730 740 750 760 770 780 790 800 810 


ACTGGTGACT TCACTTTTGA GTGCAACTGC CTTCCAGGTT TTGAAGGGAG CACCTGTGAG AGGAATATTG ATGACTGCCC TAACCACAGG 
T C D FIFE C N C L P G F E G S T C E R N I 0 0 C P N H R> 
820 830 840 850 860 870 880 890 900

TGTCAGAATG GAGGGGTTTG TGTGGATGGG GTCAACACTT ACAACTGCCG CIGTCCCCCA CAAIGGACAG GACAGTTCTG CACAGAGGAT 
CON G G V C V D G V N T Y N C R C P P 0 W I G Q F C T E 0> 

910 920 930 940 950 960 970 980 990 

GTGGATGAAT GCCTGCTGCA GCCCAATGCC IGTCAAAATG GGGGCACCTG TGCCAAOCGC AATGGAGGCT ATGGCTGTGT ATGTGTCAAC 
VOE CLLQ PNA CQN GGTC A N R NGG YGCV C V N> 

1000 1010 1020 1030 1040 1050 1060 1070 1080

GGCTGGAGTG GAGATGACTG CAGTGAGAAC ATTGATGATT GIGCCTTCGC CTCCTGTACT CCAGGCTCCA CCTGCATCGA CCCTGTGGCC 
GWS GOOC SEN 100 C A F A SCT PGS TCID RVA> 

1090 1100 1110 1120 1130 1140 1150 1160 1170 

TCCTTCTCTT GCATGTGCCC AGAGGGGAAG GCAGGTCTCC TCTCTCATCT GGATGATGCA TGCATCAGCA ATCCTTGCCA CAAGGGGGCA 
SFS CMCP EGK AGL LCHL OOA CIS NPCH KGA> 

1180 1190 1200 1210 1220 1230 1240 1250 1260 


CIGTGTGACA CCAACCCCCT AAATGGGCAA TATATTTGCA CCTGCCCACA AGGCTACAAA GGGGCIGACT GCACAGAAGA TGTGGATGAA 
LCD TNPL NGQ YIC TCPO GYK GAD CTED VDE> 

1270 1280 1290 1300 1310 1320 1330 1340 1350 


TGTGCCATGG CCAATAGCAA TCCTTGTGAG CATGCAGGAA AATCTGTGAA CACGGAIGGC GCCTTCCACT GTGAGTGTCI GAAGGGTTAT 
CAM ANSN PCE HAG KCVN TOG AFH CECL K G Y> 

1360 1370 1380 1390 1400 1410 1420 1430 1440 
GCAGGACCTC GTTGTGAGAT GGACATCAAT GAGTGCCATT CAGACCCCTG CCAGAATCAT GCTACCTGTC TGGATAAGAT TGGAGGCTTC 
AGP RCEM OIN ECH SOPC QNO ATC IDKI GGF> 

1450 1460 1470 1480 1490 1500 1510 1520 1530

ACATGICTGT GCATGCCAGG TTTCAAAGGT GTGCAITGTG AATTAGAAAT AAATGAATGT CAGAGCAACC CTTGTGTGAA CAATGGGCAG 
T C L C M P G F K G V H C E L E I NEC 0 S N P C V N N G 0> 

1540 1550 1560 1570 1580 1590 1600 1610 1620 


TGTGTGGATA AAGTCAATCG TTTCCAGTGC CTGTGTCCTC CTGGTTTCAC TGGGCCAGTT TGCCAGATTG ATATTGATGA CTGTTCCAGT 
CVD KVNR FOC LCP PGFT GPV COI OIOD CSS> 
1630 1640 1650 1660 1670 1680 1690 1700 1710

ACTCCGTGTC TGAATGGGGC AAAGIGTATC GATCACCCGA AIGGCTAIGA ATGCCAGTGT GCCACAGGTT TCACTGGTGT GTTGTGTGAG 
I P C L N G A K C I 0 H P N G Y E C Q C A I G F T G V L C E> 

1720 1730 1740 1750 1760 1770 1780 1790 1800 

GACAACATTG ACAACTGTGA CCCCGATCCT TGCCACCATG GTCAGTGTCA GGATGGTATT GATTCCTACA CCTGCATCTC CAATCCCGGG 
E N I D N C 0 POP C H H G 0 C 0 D G I 0 S Y T C I C N P G> 


1810 1820 1830 1840 1850 1860 1870 1880 1890 

TACATGGGCG CCATCTGCAG IGACCAGATT GATGAATGTT ACAGCAGCCC TTGCCTGAAC GATGGTCGCT GCATTGACCT GGTCAATGGC 
y M G A I C S D Q I DEC Y S S P C L N D G R C I 0 L V N G> 

1900 1910 1920 1930 1940 1950 1960 1970 1980
TACCAGTGCA ACTGCCAGCC AGGCACGTCA GGGGTTAATT GTGAAATTAA TTTTGATGAC TGTGCAAGTA ACCCTTGTAT CCATGGAATC 
YOC NCQP CTS CVN CEIN FDD CAS NPCI HG1> 

1990 2000 2010 2020 2030 2040 2050 2060 2070 
TGTATGGATG GCATTAATCG CTACAGTTGT GICTGCTCAC CAGGATTCAC AGGGCAGAGA TGTAACATTG ACATTGATGA GTGTGCCTCC 
C M 0 G I N R Y S C V C S P G F T G Q R C N I D I D E C A S> 

2080 2090 2100 2110 2120 2130 2140 2150 2160 


AATCCCTGTC GCAAGGGTGC AACATGTATC AACGGTGTGA ATGGTTTCCG CTGTATATGC CCCGAGGGAC CCCATCACCC CAGCTGCTAC 
NPC RKGA TCI NGV NGFR CIC PEG PHHP SCY> 

2170 2180 2190 2200 2210 2220 2230 2240 2250

TCACAGGTGA ACGAATGCCT GAGCMTCCC TGCATCCAIG GAAACTGTAC TGGAGGTCTC AGTGGATATA AGTGTCTCTG TGATGCAGGC 
SQV NECL SNP CIH GNCT GCL SGY KCLC 0 A G> 

"2260 2270 2280 2290 .2300 2310 2320 2330 2340 

TGGGTTGGCA TCAACTGTGA AC1GGACAAA AATGAATGCC TTTCGAATCC ATGCCAGAAT GGAGGAACTT GTGACAATCI GGTGAATGGA

WVG INCE VOK NEC LSNP CON GGT CONL VNG> 

2350 2360 2370 2380 2390 2400 2410 2420 2430 

TACAGGTGTA CTTGCAAGAA GGGCTTFAAA GGCTATAACT GCCAGGIGAA TATTGATGAA TGTGCCTCAA ATCCATGCCT GAACCAAGGA 
Y R C T C K F G F K G Y N C 0 V N I 0 E CAS N P C L N Q G> 
2440 2450 2460 2470 2480 2490 2500 2510 2520

ACCTGCTTTG ATCACATAAG TGGCTACACT TGCCACTGTG TGCTGCCATA CACAGGCAAG AATTGTCAGA CAGTATTGGC TCCCTGTTCC 
T C F 0 D I S G Y T C H C V L P Y T G K N C 0 T V L A P C S> 

2530 2540 2550 2560 2570 2580 2590 2600 2610 

CCAAACCCTT GTGAGAATGC TGCTGTTTGC AAAGAGTCAC CAAATTTTGA GAGTTAIACT TGCTTCTGTC CTCCTGGCTG GCAAGGTCAG 

P N P C E N A A V C K E S P N F E S Y T C L C A P G W 0 G Q> 

2620 2630 2640 2650 2660 2670 2680 2690 2700

CGGTGIACCA TTGACATTGA CCAGTGTATC TCCAAGCCCT GCATGAACCA TGGTCTCTGC CATAACACCC AGGGCAGCTA CATGTGTGAA 
R C T 10 10 E C I S K P C M N H . G L C H N T 0 G S Y M C E> 

2710 2720 2730 2740 2750 2760 2770 2780 2790 
TGTCCACCAG GCTTCAGTGG TATGGACTGT CAGGAGGACA TTGATGACTG CCTTGCCAAT CCTTGCCAGA ATGGAGGTTC CTGTATGGAT 
C P P G F S G M 0 C E E D I D D C LAN P C 0 N G G S C M D> 

2800 2810 2820 2830 2840 2850 2860 2870 2880 
GGAGTGAATA CTTTCTCCTG CCTCTGCCTT CGGGGTTTCA CTGGGGATAA GTGCCAGACA'GACATGAATC AGTGTCTGAG TGAACCCTGT 
G V N T F S C L C L P G F T G 0 K C. Q T 0 M N E C ,L S E P O 

2890 2900 2910 2920 2930 2940 2950 2960 2970 
AAGAATGGAG GGACCTGCTC TGACTACGTC AACAGTTACA CTTGCAAGTG CCAGGCAGGA TTTGATGGAC TCCATTGTGA GAACAACATC 
KNG GTCS OYV NSY TCKC QAG FOG VHCE NNI> 

2980 2990 3000 3010 3020 3030 3040 3050 3060
AATCAGTGCA CTGAGAGCTC CTGTTTCMI GGTGGCACAT GTGTIGATGG GATTAACTCC TTCTCTTGCT TGTGCCCTGT GGGTTTCACT 
NEC T E S S C F N G G T C V 0 G INS F S C L C P V G F T> 

3070 3080 3090 3100 3110 3120 3130 3140 3150

GGATCCTTCT GCCTCCATGA GATCAATGAA TGCAGCTCTC ATCCATGCCT GAATGAGGGA ACGTGTGTTG ATGGCCTGGG TACCTACCGC

G S F C L H E I N E CSS H P C L N E G T C V 0 G L G T Y R> 

3160 3170 3180 3190 3200 3210 3220 3230 3240
TGCAGCTGCC CCCTGGGCTA CACTGGGAAA AACTGTCAGA CCCTGGTGAA TCTCTGCAGT CGGTCTCCAT GTAAAAACAA AGGTACTTGT 
CSC P L G Y T G K N C 0 T L V N L C S R S P C K N K G T C> 
3250 3260 3270 3280 3290 3300 3310 3320 3330

GTTC/CAAAA AAGCAGAGTC CCAGTGCCTA TGTCCATCTG GATGGGCTGG TGCCTATTGT GACGTGCCCA ATGTCTCTTG TCACATAGCA 
V Q K K A E S Q C L CPS G W A G A Y C D V P N V S C 0 I A> 

3340 3350 3360 3370 3380 3390 3400 3410 3420 

GCCTCCAGGA GAGGTGTGCT TGTTGAACAC TTGTGCCAGC ACTCAGGTGT CTGCATCAAT GCTGGCAACA CGCATTACTG TCAGTGCCCC 
A S R R G V L V E H L C Q H S G V C I N A G N T H Y C 0 C P> 

3430 3440 3450 3460 3470 3480 3490 3500 3510 
CTCGGCTATA CTGGGAGCTA CTGTGAGGAG CAACTCGATG AGTGTGCGTC CAACCCCTGC CAGCACGGGG CAACATGCAG TGACTTCATT 
L G Y T G S Y C E E Q L 0 E C A S N P C 0 H G A T C S D F l> 

3520 3530 3540 3550 3560 3570 3580 3590 3600

GCIGGATACA GATGCGAGTG TCTCCCAGGC TATCAGGGTG TCAACTCTGA GTATGAAGTG GATGAGTGCC AGAATCAGCC CTGCCAGAAT 
GGY RCEC VPC YOG VNCE YEV DEC QNQP CQN> 

3610 3620 3630 3640 3650 3660 3670 3680 3690 

GGAGGCACCT GTATTGACCT TGTGAACCAT TTCAAGTGCT CTIGCCCACC AGGCACTCGG GGCCTACTCT GTGAAGAGAA CATTGATGAC 
G G T C I D L V N H F K C S C P P G T R G L L C E E N • I 0 D> 

3700 3710 3720 3730 3740 3750 3760 3770 3780 

TGTGCCCGGG GTCCCCATTG CCTTAATGGT GGICAGTGCA TGGATAGGAT TGGAGGCTAC AGTTGTCGCT GCTTGCCTGG CTTTGCTGGG 
CAR GPHC LNG GQC MDRI GGY SCR CLPG F A G> 

3790 3800 3810 3820 3830 3840 3850 3860 3870 
GAGCGTTGTG AGGGAGACAT CAACGAGIGC CTCTCCAACC CCTGCAGCTC TGAGGGCAGC CTGGACIGTA 1ACAGCTCAC CMTCACTAC 
E R C E G 0 I NEC L S N P C S S E G S I 0 C I Q L T N 0 Y> 

" 3880 3890 3900 3910 3920 3930 3940 3950 3960 

CTGTGIGTTT GCCGTAGTGC CTTTACTGGC CGGCACIGTG AAACCTTCGT CGATGTGTGT CCCCAGATGC CCTGCCTGAA TGGAGGGACT 
L C V C R S A F T G R H C E T F V 0 V C POM P C L N G G T> 

3970 3980 3990 4000 4010 4020 4030 4040 4050 
TGTGCIGTGG CCAGTAACAT GCCTGATGGT TTCATTTGCC GTTGTCCCCC GGGATTTTCC GGGGCAAGGT GCCAGAGCAG CTGTGGACAA 
C A V A S N M P D G F I C R C P P G F S CAR C Q S S C G 0> 
4060 4070 4080 4090 4100 4110 4120 4130 4140
GTGAAATGTA GGAAGGGGGA GCAGTCTGTG CACACCGCCT CTGGACCCCG CIGCTTCTGC CCCAGTCCCC GGGACTGCGA GTCAGGCTGT 

V K C R X G E 0 C V H T A S C P R C F C P S P R 0 C E SCO 

4150 4160 4170 4180 4190 4200 4210 4220 4230 
GCCAGIAGCC CCTGCCACCA CGGGGGCAGC IGCCACCCTC AGCGCCAGCC TCCTTAITAC TCCTGCCAGT GTGCCCCACC ATTCTCGGGT 
ASS P C Q H G G S C H P Q R 0 P P Y Y SCO C A P P F S G> 

4240 4250 4260 4270 4280 4290 4300 4310 4320 


AGCCGCTGTG AACTCTACAC GGCACCCCCC AGCACCCCTC CTGCCACCIG TCTGAGCCAG TATTGTGCCG ACAAAGCTCG GGAIGGCCTC 
SRC ELYT APP STP PATC LSO YCA OKAR 0GV> 

4330 4340 4350 4360 4370 4380 4390 4400 4410 


TGTGATGAGG CCTGCAACAG CCATGCCTGC CAGTGGG<*TG GGGGTGACTG TTCTCTCACC ATGGAGAACC CCIGGGCCAA CTGCTCCTCC 
CDE ACNS HAC OWO GGOC SLT MEN PWAN CSS> 

4420 4430 4440 4450 4460 4470 4480 4490 4500 

CCACTTCCCT GCTGGGATTA TATCAACAAC CAGTGTGATG AGCTGTGCAA CACGGTOGAG TGCCTGTTTG ACAACTTTGA ATGCCACGGG 
PLP CWOY INN QCO ELCN TVE CLF ONFE CQG> 

4510 4520 4530 4540 4550 4560 4570 4580 4590 


AACAGCAAGA CATGCAAGTA TGACAAATAC TGTGCAGACC ACTTCAAAGA CAACCACTGT AACCAGGGGT GCAACAGTGA GGAGTGTGGT 
N S K T C K Y D K Y CAD H F K 0 N H C NOG C N S E ECO 

4600 4610 4620 4630 4640 4650 4660 4670 4680 


TGGGATGGGC TGGACTGTGC TGCIGACCAA CCTGAGAACC IGGCAGAAGG TACCCIGGTT ATTGTGGTAT TGATGCCACC TGAACAACTG 
W 0 G LOCA ADQ PEN LAEG f L V IVV IHPP EQL> 

4690 4700 4710 4720 4730 4740 4750 4760 4770

CTCCAGGATG CTCGCAGCTT CTTGCGGGCA CTGGGTACCC TGCTCCACAC CAACCTGCGC ATTAAGCGGG ACTCCCAGGG GGAACTCATG 
LOO ARSFLRA LGT LLHT MLR I K R DSQG ELM> 

4780 4790 4800 4810 4820 4830 4840 4850 4860 


GTGTACCCCI ATTATGGTGA GAAGTCAGCT GCTATGAAGA AACAGAGGAT GACACGCAGA TCCCTTCCTG GTGAACAAGA ACAGGAGGTG 
VYP YYGE KSA A M K KORM IRR SLP GEQE QEV> 
4870 4880 4890 4900 4910 4920 4930 4940 4950

GCTGGCTCTA AAGTCTTTCT GGAAAITGAC AACCGCCAGT GIGHCAAGA CTCAGACCAC TGCTTCAAGA ACACGGATGC AGCAGCAGCT 
AGS K V f L E I 0 N R Q C V Q 0 S 0 H C F K N T 0 A A A A> 

4960 4970 4980 4990 5000 5010 5020 5030 5040 


CTCCTGGCCT CTCACGCCAT ACAGGGGACC CTGTCATACC CTCTTGTGIC TCTCGTCAGT GAATCCCTGA CTCCAGAACG CACTCAGCTC 
L I A S H A 1 Q G T L S Y P L V S V V S E S L T P E R T 0 L> 

5050 5060 5070 5080 5090 5100 5110 5120 5130 


CTCTATCTCC TTGCTGTTGC TGTTGTCATC AITCTGTTTA TTATTCTGCT GGGGGTAATC ATGGCAAAAC GAAAGCGTAA GCATGGCTCT 
L Y L LAVA V V I I L F I I L L G V I M A K R K R K H G S> 

5140 5150 5160 5170 5180 5190 5200 5210 5220 

CTCTGGCTGC CTGAAGGTTT CACTCTTCGC CGAGATGCAA GCAATCACAA GCGTOGTGAG CCAGTGGGAC AGGATGCTGT GGGGCTGAAA 
L W L P E G F T L R R 0 A S N H K R R E P V G Q 0 A V G L K> 

5230 5240 5250 5260 5270 5280 5290 5300 5310 


AATCTCTCAG TGCAAGTCTC AGAAGCTAAC CTAATTGGTA CTGGAACAAG TGAACACTGG GTCGATGATG AAGGGCGCCA GCCAAAGAAA 
NLS VOVS EAN LIG TGTS E H W V 0 0 EG P Q P K K> 

5320 5330 5340 5350 5360 5370 5380 5390 5400 


GTAAAGGCTG AAGATGAGGC CTTACTCTCA GAAGAAGATG ACCCCATIGA TCGACGGCCA TGGACACAGC AGCACCTTGA AGCTGCAGAC 
V K A E D E A L L S E E D D P I D R R P W T Q 0 H L E A A D> 

5410 5420 5430 5440 5450 5460 5470 5480 5490 


ATCCGTAGGA CACCATCGCT GGCTCTCACC CCTCCTCAGG CAGAGCAGGA GGTGGATGTG TTAGATGTGA ATGTCCGTGG CCCAGATGGC 
IRR TPSL A L I PPQAEQE VOV LOV NVRG POO 

5500 5510 5520 5530 5540 5550 5560 5570 5580

TGCACCCCAT TGATGTTGGC TTCTCTCCGA GGAGGCAGCT CAGATTTGAG TGATGAAGAT GAAGATGCAG AGGACTCTIC TGCTAACATC 
CTP LMLA SLR GGS SOLS OEO EDA EDSS ANI> 

5590 5600 5610 5620 5630 5640 5650 5660 5670 


ATCACAGACT TGGTCTACCA GGGTGCCAGC CTCCAGGCCC AGACAGACCG GACTGGTGAG ATGGCCCTGC ACCTTGCAGC CCGCTACTCA 
ITO LVYQ GAS L 0 A OIOR IGE MAI H L A A RYS> 
5680 5690 5700 5710 5720 5730 5740 5750 5760

CGCGCTGATG CTGCCAAGCG TCTCCTGGAT GCAGGIGCAG ATGCCAATGC CCAGGACAAC AIGGGCCGCT GTCCACTCCA TGCTGCAGTG 
RAO AAKR LLO AGA 0 A N A QON M G R CPLH AAV> 

5770 5780 5790 5800 5810 5820 5830 5840 5850

GCAGCTGATG CCCAAGGTGT CTTCCAGATT CTGATTGGCA ACCGAGTAAC IGATCTAGAT GCCAGGATGA AIGATGGTAC TACACCCCTG 
A A 0 AOGV FQ'I L I R NRVT OLD ARM NDGT TPL> 

5860 5870 5880 5890 5900 5910 5920 5930 5940 


A1CCTGGCTG CCOGCCTGGC TGTGGAGGGA ATGGTGGCAG AACTGATCAA CTGCCAAGCG GATGTGAATG CAGTGGATGA CCATGGAAAA 
I L A ARLA VEG M V A ELIN CQA OVN A V D 0 HGK> 

5950 5960 5970 5980 5990 6000 6010 6020 6030 

TCTGCTCTTC ACTGGGCAGC TGCTGTCAAT AATGTGGAGG CAACTCTTTT GTTGTTGAAA AATGGGGCCA ACCGAGACAT CCAGGACAAC 
SAL HWAA A V N NVE ATLL LLK NGA NROM QON> 

6040 6050 6060 6070 6080 6090 6100 6110 6120 


AAGGAAGAGA CACCTCTGTT TCTTGCTGCC CGGGAGGGGA GCTATGAAGC AGCCAAGATC CTGTTAGACC "ATTTTGCCAA TCGAGACATC 
K E E T P L F L A A REG S Y E A A K 1 LLO H F A N R 0 !> 

6130 6140 6150 6160 6170 6180 6190 6200 6210 

ACAGACCATA TGGATCGTCT TCCCCGGGAT GTGGCTCGGG ATCGCATGCA CCATGACATT GTGCGCCTTC IGGATGAATA CAATGTGACC 
T D H MORI PRO V A R D R M H H 0 i V R L L 0 E Y N V T> 

6220 6230 6240 6250 6260 6270 6280 6290 6300 

CCAAGCCCTC CAGGCACCGT GTTGACTTCT GCTCTCTCAC CTGICATCTG TGGGCCCAAC AGATCTTTCC TCAGCCTGAA GQOCCCCA 
P S P P G T V L T S A L S P V I C G P N R S F L S L K H T P> 

6310 6320 6340 6350 6360 6370 6380 6390 6400 


ATGGGCAAGA AGTCTAGACG GCCCAGTGCC AAGAGTACCA TGCCTACTAG CCTCCCTAAC CTTGCCAAGG AGGCAAAGGA TGCCAAGGGT 
' MGK KSRR PSA KST MPTS LPN LAK EAKD AKG> 

6400 6410 6420 6430 6440 6450 6460 6470 6480 


AGTAGGAGGA AGAAGTCTCT GAGIGACAAG GTCCAACTGT CTGAGAGTTC AGTAACTTTA TCCCCTGTTG ATTCCCTAGA ATCTCCTCAC 
S R R K K S L S E K VOL S E S S V T L S P V 0 S L E S P H> 
6490 6500 6510 6520 6530 6540 6550 6560 6570

ACCTATGTTT CCGACACCAC ATCCTCTCCA ATGATTACAT CCCCTGGGAT CTTACAGGCC TCACCCAACC CTATGTTGGC CACTGCCGCC 
T Y V S 0 T T S S P KIT S P G I L 0 A S P N P M L A T A A> 

6580 6590 6600 6610 6620 6630 6640 6650 6660 

CCTCCTGCCC CACTCCATGC CCAGCATGCA CTATCTTTTT CTAACCTTCA TGAAATGCAG CCTTTGGCAC AIGGGGCCAG CACTGTGCTT 
PPA PVHA QHA LSF SNLH EMQ PLA HGAS TVL> 


6670 6680 6690 6700 6710 6720 6730 6740 6750

CCCTCAGTGA GCCAGTTGCT ATCCCACCAC CACATTGTGT CTCCAGGCAG TGGCAGTGCT GGAAGCTTGA GTAGGCTCCA TCCAGTCCCA 
P S V S Q L I S H H HIV S P G S - G S A G S L S R L H P V P> 

6760 6770 6780 6790 6800 6810 6820 6830 6840 


GTCCCAGCAG ATTGGATGAA CCGCATGGAG GTGAATGAGA CCCAGTACAA TGAGATGTTT GGTATGGTCC TGGCTCCAGC TGAGGGCACC 
VPA DWMN RUE VNE TQYN E M F GMV LAPA EGT> 

6850 6860 6870 6880 6890 6900 6910 6920 6930 

CATCCTGGCA TAGCTCCCCA GAGCAGGCCA CCTGAAGGGA AGCACATAAC CACCCCTCGG GAGCCCTTGC CCCCCATTGT GACTTTCCAG 
' H P G I A P Q S R P P E G K H I T T P R E f I P P I V T F Q> 

6940 6950 6960 6970 6980 6990 7000 7010 7020 


CTCATCCCTA AAGGCAGTAT TGCCCAACCA GCGGGGGCTC CCCAGCCTCA GTCCACCTGC CCTCCAGCTG TTGCGGGCCC CCTGCCCACC 
LIP K G S I A Q P A G A P 0 P 0 S T C PPA V A G P L P T> 

7030 7040 7050 7060 7070 7080 7090 7100 7110 


ATGTACCAGA TTCCAGAAAT GGCCCGTTTG CCCAGTGTGG CTTTCCCCAC TGCCATGATG CCCCAGCAGG ACGGGCAGGT AGCTCAGACC 
MYO IPEM ARL PSV AFPT AMM POO OGQV A Q T> 

7120 7130 7140 7150 7160 7170 7180 7190 7200 


ATTCTCCCAG CCTATCATCC TTTCCCAGCC TCTGTGGGCA AGTACCCCAC ACCCCCTTCA CAGCACAGTT ATGCTTCCTC AMTGCTGCT 
ILP AYHP FPA SVG KYPT PPS OHS YASS N A A> 

7210 7220 7230 7240 7250 7260 7270 7280 7290 

GAGCGAACAC CCAGTCACAG TGGTCACCTC CAGGGTGAGC ATCCCTACCT GACACCATCC CCAGAGTCTC CTGACCAGTG GTCAAGITCA 
E R T P S H S G H L Q G E H P Y I IPS PES P 0 Q W S S S> 
7300 7310 7320 7330 7340 7350 7360 7370 7380

TCACCCCACT CTGCTTCTGA CTGGTCAGAT GTCACCACCA GCCCTACCCC TGGGGGTGCT GGAGGAGGIC AGCGGGCACC IGGGACACAC 
S P H S A S D W S D V T T S P T P G G A G G G 0 R G P G T H> 

7390 7400 7410 7420 7430 7440 7450 7460 7470 


ATGTCrCAGC CACCACACAA CAACAIGCAG GTTTATGCGT GAGAGAGTCC ACCICCAGTG TAGAGACATA ACTGACTTTT GTAAATGCTG 
MSEPPHNMQVYA> 

7480 7490 7500 7510 7520 7530 7540 7550 7560 

CTGAGGAACA AAIGAAGGTC ATCCGGGAGA GAAATGAAGA AATCTCTGGA GCCAGCTTCT AGAGGTAGGA AAGAGAAGAT GTTCTTATTC 

7570 7580 7590 7600 7610 7620 7630 7640 7650 

AGATAATGCA AGAGAAGCAA TTCGTCAGTT TCACTGGGTA TCTGCAAGGC TTATTGATTA TTCTAATCTA ATAAGACAAG TTTGTGGAAA 

7660 7670 7680 7690 7700 7710 7720 7730 7740 

IGCAAGATGA ATACAAGCCI IGGGTCCATG TTTACTCTCT TCTATTTGGA GAATAAGATG CATGCTTATT GAAGCCCAGA CATTCTTGCA 

7750 7760 7770 7780 7790 7800 7810 7820 7830

GCTTGGACTG CATTTTAAGC CCTGCAGGCT TCTGCCATAT CCATGAGAAG ATTCTACACT AGCGTCCTGT TGGGAATTAT GCCCTGGAAT 

7840 7850 7860 7870 7880 7890 7900 7910 7920 

TCTGCCTGAA TTGACCTACG CATCTCCTCC TCCTTGGACA TTCTTTTGTC TTCATTTGGT GCTTTTGGTT TTGCACCTCT CCGTGATTGT 

7930 7940 7950 7960 7970 7980 7990 8000 8010 

AGCCCTACCA GCATGTTATA GGGCAAGACC TTTGTGCTTT TGATCATTCT GGCCCATGAA AGCAACTTTG GTCTCCTTTC CCCTCCTGTC 

8020 8030 8040 8050 8060 8070 8080 8090 8100

TTCCCGGTAT CCCTTGGAGT CTCACAAGGT TTACTTTGGT ATGGTTCTCA GCACAAACCT TTCAAGTATG TTGTTTCTTT GGAAAATGGA 

8110 8120 8130 8140 8150 8160 8170 8180 8190 

CATACTGTAT TGTGTTCTCC TGCATATATC ATTCCTGGAG AGAGAAGGGG AGAAGAATAC TTTTCTTCAA CAAATTTTGG GGGCAGGAGA 

8200 8210 8220 8230 8240 8250 8260 8270 8280 


TCCCTTCAAG AGCCIGCACC TTAATTTTTC TTGTCTGTGT GCAGGTCTTC ATATAAACTT TACCAGGAAG AAGGGTGTGA GTTTGTTGTT 
8290 8300 8310 8320 8330 8340 8350 8360 8370
TTTCIGTGTA TGGGCCTGGT CAGTGTAAAG TTTTATCCTT GATAGTCTAG TTACTATGAC CCTCCCCACT TTTTTAAAAC CAGAAAAAGG 

8380 8390 8400 8410 8420 8430 8440 8450 8460 

TTTGGAATGT TGGAATGACC AAGAGACAAG TTAACTCGTG CAAGAGCCAG TTACCCACCC ACAGGTCCCC CTACTTCCTG CCAAGCATTC 

8470 8480 8490 8500 8510 8520 8530 8540 8550 

CATTGACTCC CTGTATGGAA CACATTTGTC CCAGATCTGA GCATTCTAGG CCTGTTTCAC TCACTCACCC AGCATATGAA ACTAGTCTTA 

8560 8570 8580 8590 8600 8610 8620 8630 8640 

ACTGTTGAGC CTTTCCTTTC ATATCCACAG AAGACACTGT CTCAAATGTT GTACCCTTGC CATTTAGGAC TGAACITTCC TTAGCCCAAG 

8650 8660 8670 8680 8690 8700 8710 8720 8730 

GGACCCAGTG ACAGTTGTCT TCCGTTTGTC AGATGATCAG TCTCTACTGA 1TATCTTGCT GCTTAAAGGC CTGCTCACCA ATCITTCTTT 

8740 8750 8760 8770 8780 8790 8800 8810 8820 

CACACCGTGT GGTCCGTGTT ACTGGTATAC CCAGTATGTT CTCACTGAAG ACATGGACTT TATATGTTCA AGTGCAGGAA TIGGAAAGTT 

8830 8840 8850 8860 8870 8880 8890 8900 8910 
GGACTTGTTT TCTATGATCC AAAACAGCCC TATAAGAAGG TTGGAAAAGG AGGAACTATA TAGCAGCCTT TGCTATTTTC TGCTACCATT 

8920 8930 8940 8950 8960 8970 8980 8990 9000 

TCTTTTCCTC TGAAGCGGCC ATGACATTCC CTTTGGCAAC TAACGTAGAA ACTCAAC^A ACATTTTCCT TTCCTAGAGT CACCTTTTAG 

9010 9020 9030 9040 9050 9060 9070 9080 9090 

ATGATAATGG ACAACTATAG ACTTCCTCAT TGTTCAGACT CATTGCCCCT CACCTGAATC CACTCTCTGT ATTCATGCTC TTGGCAATTT 

9100 9110 9120 9130 9140 9150 9160 9170 9180 

CTTTGACTTT CTTTTAAGGG CAGAAGCATT TTAGTTAATT CTAGATAAAG AATAGTTTTC TTCCTCTTCT CCTTGGGCCA GTTAATAATT 

9190 9200 9210 9220 9230 9240 9250 9260 9270 

GGTCCATGGC TACACTGCAA CTTCCGTCCA GTGCTGTGAT GCCCATGACA CCTGCAAAAT AAGTTCTGCC TGGGCATTIT GTAGATATTA 

FIG.17K 
9280 9290 9300 9310 9320 9330 9340 9350 9360

ACAGGTGAAT TCCCGACTCT TTTGGTTTGA ATGACAGTTC TCATTCCTTC TATGGCTCCA ACTAIGCATC AGTGCITCCC ACITACCTGA 

9370 9380 9390 9400 9410 9420 9430 9440 9450 


TTTCTCTGTC GGTGGCCCCA TATGGAAACC CTGCGTGTCT GTTGGCATAA TAGTTTACAA ATGGTTTTTT CAGTCCTATC CAAATTTATT 

9460 9470 9480 9490 9500 9510 9520 9530 9540 

GAACCAACAA AAATAATTAC TTCIGOCCTG AGATAAGCAC ATTAAGTTTG TTCATTCTCT GCITTATTCT CICCATGTGG CAACATTCTG 

9550 9560 9570 9580 9590 9600 9610 9620 9630 

TCAGCCTCTI TCATAGTGTG CAAACATTTT ATCATTCIAA ATGGTCACTC TCTGCCCTTG GACCCATTTA ITATTCACAG ATGGGGAGAA 

9640 9650 9660 9670 9680 9690 9700 9710 9720 

CCIATCTGCA TGGACCCTCA CCATGCTCTG TGCAGCACAC ACAGTGCAGG GAGCCAGTGG CGATGGCGAT GACTIICTTC CCCTGGGAAT 

TCC 